I am not much good with Excel, but I encounter it a lot as data source. AutoIt is not too shabby for fast for-the-task programs that handle arrays.
So I had looked into automated way to read data from Excel sheet into AutoIt array.
Native way
AutoIt has bundled UDF for work with Excel. It is very easy to use and gets work done in just a few lines of code. Reading sheet of data into array would be like:
#include <Excel.au3>
$oExcel = _ExcelBookOpen("fileName.xls")
_ExcelSheetActivate($oExcel,"sheetName")
$result=_ExcelReadSheetToArray($oExcel)
_ExcelBookClose($oExcel)
Open book, open sheet, get data. Straightforward but in reality it can be very slow. For sheets with tens of thousands rows it can take well over minute (or few).
Workaround
I had tried to work around it and turns out simply copying sheet into clipboard and processing from there into array takes much less time (at the price of writing some code).
It is broken into several functions.
Func _StringCount($string, $substring)
Local $i, $count = 0
For $i = 1 To StringLen($string)
If StringMid($string, $i, StringLen($substring)) = $substring Then
$count = $count + 1
EndIf
Next
Return $count
EndFunc
This one I had simply looked up on AutoIt forums. It counts number of string occurrences in another string. Used later to determine number columns by counting delimiter (TAB) that Excel inserts into table when it is copied to clipboard.
Func ExcelSheetToClip($excel, $sheet)
_ExcelSheetActivate($excel, $sheet)
ClipPut("")
Send("^{HOME}^a^c")
Do
Sleep(100)
Until ClipGet()
Return ClipGet()
EndFunc
Function automates getting sheet into clipboard. Takes Excel object and sheet name for parameters and then:
- switches to sheet;
- clears clipboard (I had Excel throwing up occasional error otherwise);
- sends series of hotkeys:
- Ctrl+Home to go on first cell and clear selection;
- Ctrl+A to select all, Excel will ignore whitespace;
- Ctrl+C to copy it;
- waits for data to properly get in clipboard;
- returns that data.
Func String2DSplit($string,$rowDelimiter=@CRLF,$columnDelimiter=" ")
$lines = StringSplit(StringStripWS($string, 3), $rowDelimiter, 1)
$columnsNum = _StringCount($lines[1], $columnDelimiter) + 1
Dim $result[$lines[0]][$columnsNum] = [[0]]
For $i = 1 To $lines[0]
$columns = StringSplit($lines[$i], $columnDelimiter)
For $j = 1 To $columns[0]
$result[$i - 1][$j - 1] = $columns[$j]
Next
Next
Return $result
EndFunc
This one I had also picked up from forums, but reworked. It turns giant string into actual two-dimensional array:
- splits string into array of lines;
- counts number of delimiter occurrence in first line to determine number of columns;
- defines empty array with dimension equal to amount of lines and columns;
- loops through each line, splits it into values and fills large array;
- returns array when done.
And just a simple wrapper function to bring two main pieces together:
Func ReadSheet($excel, $sheet)
Return String2DSplit(ExcelSheetToClip($excel, $sheet))
EndFunc
Required some additional code and plenty of polish, but resulting usage is hardly complicated:
#include <Excel.au3>
$oExcel = _ExcelBookOpen("fileName.xls")
$result = ReadSheet($oExcel, "sheetName")
_ExcelBookClose($oExcel)
And about 50-60 times faster for sheets I tried it on.
Overall
Using native Windows paths is often more convenient and bulletproof. But for specific cases bit of specialized code can give program excellent speed boost.
Script https://www.rarst.net/script/excelclip.au3
PS method also turned out useful to get data out of spoiled file Excel file (overflowing with weird embedded objects for some reason).
Rodney #
Rarst #
mobin #
AJ #