Home   Help Search Login Register  

Author Topic: fast string concatenation  (Read 2017 times)

0 Members and 1 Guest are viewing this topic.

Offline Denisko-Redisko

  • Members
  • *
fast string concatenation
« on: 19 Sep 2009, 09:57:36 »
Recently discovered that copying large chunks of config  to the clipboard causes the computer to hang.
It turned out that the cause was a slow string concatenation. I had to recollect old tricks )).
If you need to merge quickly a considerable quantity of string-lines this code is useful to you:
Code: [Select]
_arr = [];

for "_i" from 0 to 40000 do {
     _arr set [count _arr, "(" + (str _i) + ")" ];
};

_funcStupidJoin = { // bruteforce
    private "_str";
    _str = "";
    {
        _str = _str + _x;
    } foreach _this;
    _str;
};

_funcJoinString = {
    private "_binJoin";
    _binJoin = {
        private ["_size", "_j"];
        if( (count _this) % 2 != 0)then{
            _this set [count _this, ""]
        };
        _size = count _this / 2;
        for "_i" from 0 to _size - 1 do {
            _j = _i + _i;
            _this set [_i, (_this select _j) + (_this select (_j+1))]
        };
        _this resize _size;
    };
    while { count _this > 1 } do { call _binJoin; };
    _this select 0;
};


_arr call _funcJoinString;    // ~= 2..3 sec
//_arr call _funcStupidJoin;     // ~= 20 sec
« Last Edit: 19 Sep 2009, 10:11:23 by Denisko-Redisko »
sorry for my english

Offline Mandoble

  • Former Staff
  • ****
    • Grunt ONE and MandoMissile suite
Re: fast string concatenation
« Reply #1 on: 19 Sep 2009, 10:38:46 »
Nice old trick and nice piece of code  :good:

Offline Denisko-Redisko

  • Members
  • *
Re: fast string concatenation
« Reply #2 on: 19 Sep 2009, 15:41:46 »
Mandoble, thanx, man  :)
And that it was classical join:
Code: [Select]
#define arg(X) (_this select (X))

_funcJoinString = {

    private ["_list", "_char", "_binJoin"];

    _list = arg(0);
    _char = arg(1);

    if( count _list < 1 ) exitwith {""};

    _binJoin = {
        private ["_size", "_subsize", "_oversize", "_j"];
        _size = count _list / 2;
        _subsize = floor _size;
        _oversize = ceil _size;
        _j = 0;
        for "_i" from 0 to _subsize - 1 do {
            _list set [_i, (_list select _j) + _char + (_list select (_j+1))];
            _j = _j + 2;
        };
        if( _subsize != _oversize )then {
            _list set [_j/2, _list select _j];
        };
        _list resize _oversize;
    };

    while { count _list > 1 } do { call _binJoin; };

    _list select 0;

};

[_stringList, ", "] call _funcJoinString;
sorry for my english

Offline Mandoble

  • Former Staff
  • ****
    • Grunt ONE and MandoMissile suite
Re: fast string concatenation
« Reply #3 on: 19 Sep 2009, 15:58:39 »
This scripting style is pure art  :good:

Offline Worldeater

  • Former Staff
  • ****
  • Suum cuique
Re: fast string concatenation
« Reply #4 on: 19 Sep 2009, 17:29:18 »
Great, thanks for sharing! :clap:

This reminds me of implementing a fast string search algorithm (Boyer-Moore or derivates come to mind).

BTW: Be careful when using exitWith. According to Suma "it is not guaranteed to work reliably" when used outside loops. I don't know the right way to exit a function but something like >>scopename "foo"; ...; breakout "foo";<< should do the trick.
try { return true; } finally { return false; }

Offline Denisko-Redisko

  • Members
  • *
Re: fast string concatenation
« Reply #5 on: 20 Sep 2009, 09:37:28 »
Worldeater, thanks :)
Quote
Be careful when using exitWith
Yes, i know:
Code: [Select]
_res = ({
    if( _x == _val )exitwith{ _x }; -1
} foreach [1,2,3,4,5])
in the case when _val is 4 then _res will be 4
in the case when _val is 987 then _res will be -1

I do not like scopename, breakout, they have awful syntax, and prefer to use in such cases, try {throw} catch {}, although it is not semantic.

PS.
Generally speaking.
Quote
Boyer-Moore
Purely algorithmically, it executed almost as many operations just as would bruteforce. All the stuff in the way we are working with the memory inside the interpreter.The whole trick is how it working with the memory inside the interpreter. And if BIS optimise work with memory it is a trick will cease to work. But is better they would add in sqf-language the join-command.

if you put the counter in the place where the concatenation, we can see that for an array of 30000 elements will be executed 30004 concatenation. but it works faster. magic? :) This is a trick, not some sort of Big O algorithm ;)

-------
edited, oh my english  :weeping:
-------
Continued.

Wow, I just learned about the existence of command diag_tickTime  
So, now we can make more precise measurements:

Code: [Select]
#define inc(N) (call { N = N + 1; N })
#define arg(V) (_this select (V))
#define pushTo(array) call { array set [count array, _this] }

_testArray = [];

for "_i" from 1 to _this do {
    "(" + (str _i) + ")" pushTo(_testArray);
};

_funcStupidJoin = { // bruteforce
    private "_str";
    _str = "";
    {
        _str = _str + _x;
        inc(_counter);
    } foreach arg(0);
    _str;
};

_funcJoinString = {

    private ["_list", "_glue", "_size", "_subsize", "_oversize", "_j"];

    _list = arg(0);
    _glue = arg(1);

    if( count _list < 1 ) exitwith {""};

    while { count _list > 1 } do {
        _size = count _list / 2;
        _subsize = floor _size;
        _oversize = ceil _size;
        _j = 0;
        for "_i" from 0 to _subsize - 1 do {
            inc(_counter);
            _list set [_i, (_list select _j) + _glue + (_list select (_j+1))];
            _j = _j + 2;
        };
        if( _subsize != _oversize )then {
            inc(_counter);
            _list set [_j/2, _list select _j];
        };
        _list resize _oversize;
    };

    _list select 0;
};

_text = [];

{
    _func = call compile _x;
    _arr = +_testArray;
    _counter = 0;
    _start = diag_tickTime;
    [_arr, ", "] call _func;
    format ["<t size='1.2'>%1</t><br />time: %2<br />count: %3", _x, diag_tickTime - _start, _counter] pushTo(_text)
} foreach ["_funcJoinString", "_funcStupidJoin"];

parseText ([_text, "<br />--------<br />"] call _funcJoinString) call {
    hint _this;
    _this
};

Code: [Select]
1000 call compile preprocessFile "testStringJoin.sqf"Results:
Code: [Select]
100
_funcJoinString time: 0.00390625 count: 102
_funcStupidJoin time: 0          count: 100

10000
_funcJoinString time: 0.324219   count: 10005
_funcStupidJoin time: 1.39063    count: 10000

20000
_funcJoinString time: 0.878906   count: 20005
_funcStupidJoin time: 7.74219    count: 20000

30000
_funcJoinString time: 0.96875    count: 30004
_funcStupidJoin time: 11.2813    count: 30000

40000
_funcJoinString time: 1.42188    count: 40005
_funcStupidJoin time: 26.8477    count: 40000

50000
_funcJoinString time: 1.78516    count: 50006
_funcStupidJoin time: 44.9297    count: 50000

60000
_funcJoinString time: 1.97266    count: 60004
_funcStupidJoin time: 65.3164    count: 60000
« Last Edit: 23 Sep 2009, 00:58:42 by Denisko-Redisko »
sorry for my english

Offline mikey

  • Former Staff
  • ****
  • Whitespace Whore
Re: fast string concatenation
« Reply #6 on: 23 Sep 2009, 14:16:43 »
Those are some impressive benchmark numbers there  :good: