U++ Core Tutorial :: Ultimate++

Logging is a useful technique to trace the flow of the code and examine results. In this tutorial we will be using logging extensively, so let us start tutorial with the explanation of logging.

In debug mode and with default settings, macro LOG puts string into output log file. Log file is placed into 'config-directory', which by default is .exe directory in Win32 and ~/.upp/appname in POSIX.

In TheIDE, you can access the log using 'Debug'/'View the log file Alt+L'.

LOG("Hello world");

Hello world

You can log values of various types, as long as they have AsString function defined You can chain values in single LOG using operator<<:

int x = 123;

LOG("Value of x is " << x);

Value of x is 123

As it is very common to log a value of single variable, DUMP macro provides a useful shortcut, creating a log line with the variable name and value:

DUMP(x);

x = 123

To get the value in hexadecimal code, you can use LOGHEX / DUMPHEX

DUMPHEX(x);

String h = "foo";

DUMPHEX(h);

x = 0x7b

h = Memory at 0x0x7fff74a8f428, size 0x3 = 3

+0 0x00007FFF74A8F428 66 6F 6F foo

To log the value of a container (or generic Range), you can either use normal LOG / DUMP:

Vector<int> v = { 1, 2, 3 };

DUMP(v);

v = [1, 2, 3]

or you can use DUMPC for multi-line output:

DUMPC(v);

[0] = 1

[1] = 2

[2] = 3

For maps, use DUMPM:

VectorMap<int, String> map = { { 1, "one" }, { 2, "two" } };

DUMP(map);

map = {1: one, 2: two}

DUMPM(map);

map:

[0] = (1) one

[1] = (2) two

All normal LOGs are removed in release mode. If you need to log things in release mode, you need to use LOG/`DUMP` variant with 'R' prefix (RLOG, RDUMP, RDUMPHEX...):

RLOG("This will be logged in release mode too!");

This will be logged in release mode too!

Sort of opposite situation is when adding temporary LOGs to the code for debugging. In that case, 'D' prefixed variants (DLOG, DDUMP, DDUMPHEX...) are handy - these cause compile error in release mode, so will not get forgotten in the code past the release:

DLOG("This would not compile in release mode.");

This would not compile in release mode.

The last flavor of LOG you can encounter while reading U++ sources is the one prefixed with 'L'. This one is not actually defined in U++ library and is just a convention. On the start of file, there is usually something like:

#define LLOG(x) // DLOG(x)

and by uncommenting the body part, you can activate the logging in that particular file.

While logging to .log file is default, there are various ways how to affect logging, for example following line adjusts logging to output the log both to the console and .log file:

StdLogSetup(LOG_COUT|LOG_FILE);

1.2 String

String is a value type useful for storing text or binary data.

String a = "Hello";

DUMP(a);

a = Hello

You can concatenate it with another String or literal:

a = a + " world";

DUMP(a);

a = Hello world

Or single character or specified number of characters from another String or literal:

a.Cat('!');

DUMP(a);

a = Hello world!

a.Cat("ABCDEFGHIJKLM", 3);

DUMP(a);

a = Hello world!ABC

Clear method empties the String:

a.Clear();

DUMP(a);

a =

You can use operator<< to append to existing String. Non-string values are converted to appropriate String representation (using standard function AsString, whose default template definition calls ToString method for value):

for(int i = 0; i < 10; i++)

a << i << ", ";

DUMP(a);

a = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,

Sometimes is is useful to use operator<< to produce a temporary String value (e.g. as real argument to function call):

String b = String() << "Number is " << 123 << ".";

DUMP(b);

b = Number is 123.

String provides many various methods for obtaining character count, inserting characters into String or removing them:

a = "0123456789";

DUMP(a.GetCount());

a.GetCount() = 10

DUMP(a.GetLength()); // GetLength is a synonym of GetCount

a.GetLength() = 10

a.Insert(6, "<inserted>");

DUMP(a);

a = 012345<inserted>6789

a.Remove(2, 2);

DUMP(a);

a = 0145<inserted>6789

as well as searching and comparing methods:

DUMP(a.Find('e'));

DUMP(a.ReverseFind('e'));

a.Find('e') = 8

a.ReverseFind('e') = 11

DUMP(a.Find("ins"));

a.Find("ins") = 5

DUMP(a.StartsWith("ABC"));

DUMP(a.StartsWith("01"));

DUMP(a.EndsWith("89"));

a.StartsWith("ABC") = false

a.StartsWith("01") = true

a.EndsWith("89") = true

You can get slice of String using Mid method; with single parameter it provides slice to the end of String:

DUMP(a.Mid(3, 3));

DUMP(a.Mid(3));

a.Mid(3, 3) = 5<i

a.Mid(3) = 5<inserted>6789

You can also trim the length of String using Trim (this is faster than using any other method):

a.Trim(4);

DUMP(a);

a = 0145

You can obtain integer values of individual characters using operator[]:

DUMP(a[0]);

a[0] = 48

or the value of first character using operator* (note that if GetCount() == 0, this will return zero terminator):

DUMP(*a);

*a = 48

a.Clear();

DUMP(*a);

*a = 0

String has implicit cast to zero terminated const char *ptr (only valid as long as String does not mutate:

a = "1234";

const char *s = a;

while(*s)

LOG(*s++);

String also has standard begin end methods, which e.g. allows for C++11 for:

for(char ch : a)

LOG(ch);

It is absolutely OK and common to use String for storing binary data, including zeroes:

a.Cat(0);

DUMPHEX(a);

a = Memory at 0x0x7fff74a8f3c0, size 0x5 = 5

+0 0x00007FFF74A8F3C0 31 32 33 34 00 1234.

1.3 StringBuffer

If you need a direct write access to String's C-string character buffer, you can use complementary StringBuffer class. One of reasons to do so is when you have to deal with some C-API functions that expects to write directly to char * and you would like that result converted to the String:

void CApiFunction(char *c)

{

strcpy(c, "Hello");

}

StringBuffer b;

b.SetLength(200);

CApiFunction(b);

b.Strlen();

String x = b;

DUMP(x);

x = Hello

In this case, SetLength creates a C array of 200 characters. You can then call C-API function. Later you set the real length using Strlen - this function performs strlen of buffer and sets the length accordingly. Later you simply assign the StringBuffer to String. Note that for performance reasons, this operation clears the StringBuffer content (operation is fast and does not depend on the number of characters).

Another usage scenario of StringBuffer is altering existing String:

b = x;

b[1] = 'a';

x = b;

DUMP(x);

x = Hallo

Similar to assigning StringBuffer to String, assigning String to StringBuffer clears the source String.

StringBuffer also provides appending operations:

b = x;

b.Cat('!');

x = b;

DUMP(x);

x = Hallo!

Note that sometimes when creating some String from a lot of single characters, using StringBuffer for the operation is slightly faster then using String directly.

1.4 WString

String works with 8 bit characters. For 16-bit character encoding use WString. Both classes are closely related and share most of interface methods. U++ also provides conversions between String and WString and you can also use 8 bit string literals with WString. Conversion is ruled by current default character set. Default value of default character set is CHARSET_UTF8. This conversion is also used in WString::ToString, e.g. when putting WString to log:

WString x = "characters 280-300: "; // you can assign 8-bit character literal to WString

for(int i = 280; i < 300; i++)

x.Cat(i);

DUMP(x);

x = characters 280-300: ĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪī

ToString converts WString to String:

String y = x.ToString();

DUMP(y);

y = characters 280-300: ĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪī

ToWString converts String to WString:

y.Cat(" (appended)"); // you can use 8-bit character literals in most WString operations

x = y.ToWString();

DUMP(x);

x = characters 280-300: ĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪī (appended)

1.5 Date and Time

To represent date and time, U++ provides Date and Time concrete types.

Date date = GetSysDate();

DUMP(date);

date = 09/08/2019

All data members of Date structure are public:

DUMP((int)date.year); // we need to cast to int because some date members

DUMP((int)date.month); // are of unsigned character type which would log

DUMP((int)date.day); // as characters

(int)date.year = 2019

(int)date.month = 9

(int)date.day = 8

Dates can be compared:

DUMP(date > Date(2000, 1, 1));

date > Date(2000, 1, 1) = true

Adding a number to Date adds a number of days to it, incrementing/decrementing goes to the next/previous day:

DUMP(date + 1);

DUMP(--date);

DUMP(++date);

date + 1 = 09/09/2019

--date = 09/07/2019

++date = 09/08/2019

Subtraction of dates yields a number of days between them:

DUMP(date - Date(2000, 1, 1));

date - Date(2000, 1, 1) = 7190

There are several Date and calendar related functions:

DUMP(IsLeapYear(2012));

DUMP(IsLeapYear(2014));

DUMP(IsLeapYear(2015));

DUMP(IsLeapYear(2016));

DUMP(IsLeapYear(2017));

IsLeapYear(2012) = true

IsLeapYear(2014) = false

IsLeapYear(2015) = false

IsLeapYear(2016) = true

IsLeapYear(2017) = false

DUMP(GetDaysOfMonth(2, 2015));

DUMP(GetDaysOfMonth(2, 2016));

GetDaysOfMonth(2, 2015) = 28

GetDaysOfMonth(2, 2016) = 29

DUMP(DayOfWeek(date)); // 0 is Sunday

DayOfWeek(date) = 0

DUMP(LastDayOfMonth(date));

DUMP(FirstDayOfMonth(date));

DUMP(LastDayOfYear(date));

DUMP(FirstDayOfYear(date));

DUMP(DayOfYear(date)); // number of days since Jan-1 + 1

DUMP(DayOfYear(Date(2016, 1, 1)));

LastDayOfMonth(date) = 09/30/2019

FirstDayOfMonth(date) = 09/01/2019

LastDayOfYear(date) = 12/31/2019

FirstDayOfYear(date) = 01/01/2019

DayOfYear(date) = 251

DayOfYear(Date(2016, 1, 1)) = 1

DUMP(AddMonths(date, 20));

DUMP(GetMonths(date, date + 100)); // number of 'whole months' between two dates

DUMP(GetMonthsP(date, date + 100)); // number of 'whole or partial months' between two dates

DUMP(AddYears(date, 2));

AddMonths(date, 20) = 05/08/2021

GetMonths(date, date + 100) = 3

GetMonthsP(date, date + 100) = 4

AddYears(date, 2) = 09/08/2021

DUMP(GetWeekDate(2015, 1));

int year;

DUMP(GetWeek(Date(2016, 1, 1), year)); // first day of year can belong to previous year

DUMP(year);

GetWeekDate(2015, 1) = 12/29/2014

GetWeek(Date(2016, 1, 1), year) = 53

year = 2015

DUMP(EasterDay(2015));

DUMP(EasterDay(2016));

EasterDay(2015) = 04/05/2015

EasterDay(2016) = 03/27/2016

U++ defines the beginning and the end of era, most algorithms can safely assume that as minimal and maximal values Date can represent:

DUMP(Date::Low());

DUMP(Date::High());

Date::Low() = 01/01/-4000

Date::High() = 01/01/4000

Time is derived from Date, adding members to represent time:

Time time = GetSysTime();

DUMP(time);

DUMP((Date)time);

DUMP((int)time.hour);

DUMP((int)time.minute);

DUMP((int)time.second);

time = 09/08/2019 22:06:53

(Date)time = 09/08/2019

(int)time.hour = 22

(int)time.minute = 6

(int)time.second = 53

Times can be compared:

DUMP(time > Time(1970, 0, 0));

time > Time(1970, 0, 0) = true

Warning: As Time is derived from the Date, most operations automatically convert Time back to Date. You have to use ToTime conversion function to convert Date to Time:

DUMP(time > date); // time gets converted to Date...

DUMP(time > ToTime(date));

time > date = false

time > ToTime(date) = true

Like Date, Time supports add and subtract operations, but numbers represent seconds (using int64 datatype):

DUMP(time + 1);

DUMP(time + 24 * 3600);

DUMP(time - date); // time converts to Date, so the result is in days

DUMP(time - ToTime(date)); // Time - Time is in seconds

time + 1 = 09/08/2019 22:06:54

time + 24 * 3600 = 09/09/2019 22:06:53

time - date = 0

time - ToTime(date) = 79613

Time defines era limits too:

DUMP(Time::Low());

DUMP(Time::High());

Time::Low() = 01/01/-4000 00:00:00

Time::High() = 01/01/4000 00:00:00

1.6 AsString, ToString and operator<<

U++ Core provides simple yet effective standard schema for converting values to default textual form. System is based on the combination of template functions (following code is part of U++ library):

namespace Upp {

template <class T>

inline String AsString(const T& x)

{

return x.ToString();

}

template <class T>

inline Stream& operator<<(Stream& s, const T& x)

{

s << AsString(x);

return s;

}

template <class T>

inline String& operator<<(String& s, const T& x)

{

s.Cat(AsString(x));

return s;

}

};

Client types have to either define String ToString method or specialize AsString template in Upp namespace. Such types can be appended to Streams or Strings using operator<<. Of course, U++ value types and primitive types have required items predefined by U++:

FileOut fout(ConfigFile("test.txt"));

String sout;

fout << 1.23 << ' ' << GetSysDate() << ' ' << GetSysTime();

sout << 1.23 << ' ' << GetSysDate() << ' ' << GetSysTime();

fout.Close();

DUMP(LoadFile(ConfigFile("test.txt")));

DUMP(sout);

LoadFile(ConfigFile("test.txt")) = 1.23 09/08/2019 09/08/2019 22:06:53

sout = 1.23 09/08/2019 09/08/2019 22:06:53

Getting client types involved into this schema is not too difficult, all you need to do is to add ToString method:

struct BinFoo {

int x;

String ToString() const { return FormatIntBase(x, 2); }

};

BinFoo bf;

bf.x = 30;

sout.Clear();

sout << bf;

DUMP(sout);

sout = 11110

If you cannot add ToString, you can still specialize template in Upp namespace:

struct RomanFoo {

int x;

RomanFoo(int x) : x(x) {}

};

namespace Upp {

template <> String Upp::AsString(const RomanFoo& a) { return FormatIntRoman(a.x); }

};

1.7 CombineHash

To simplify providing of high quality hash codes for composite types, U++ provides CombineHash utility class. This class uses GetHashValue function to gather hash codes of all values and combines them to provide final hash value for composite type:

struct Foo {

String a;

int b;

unsigned GetHashValue() const { return CombineHash(a, b); }

};

Note that GetHashValue is defined as function template that calls GetHashValue method of its argument, therefore defining GetHashValue method defines GetHashValue function too:

Foo x;

x.a = "world";

x.b = 22;

DUMP(GetHashValue(x));

GetHashValue(x) = 749369797

x.a << '!';

DUMP(GetHashValue(x));

GetHashValue(x) = 1076203717

1.8 SgnCompare and CombineCompare

Traditional approach of C language of representing comparison results was 3-state: comparing a and b results in negative value (if a < b), zero (if a == b) or positive value (a > b). In C++ standard library, comparisons are usually represented with bool predicates.

However, with bool predicate it becomes somewhat more difficult to provide comparisons for composite types:

struct Foo {

String a;

int b;

int c;

// we want to order Foo instances by a first, then b, then c

bool operator<(const Foo& x) const {

return a < x.a ? true

: a == x.a ? b < x.b ? true

: b == x.b ? false

: c < x.c

: false;

}

};

U++ provides standard function SgnCompare, which returns negative value/zero/positive in "C style":

int a = 1;

int b = 2;

DUMP(SgnCompare(a, b));

DUMP(SgnCompare(b, a));

DUMP(SgnCompare(a, a));

SgnCompare(a, b) = -1

SgnCompare(b, a) = 1

SgnCompare(a, a) = 0

Default implementation of SgnCompare calls Compare method of value:

struct MyClass {

int val;

int Compare(const MyClass& x) const { return SgnCompare(val, x.val); }

};

SgnCompare is now defined for MyClass:

MyClass u, v;

u.val = 1;

v.val = 2;

DUMP(SgnCompare(u, v));

DUMP(SgnCompare(v, u));

DUMP(SgnCompare(v, v));

SgnCompare(u, v) = -1

SgnCompare(v, u) = 1

SgnCompare(v, v) = 0

Now getting back to Foo, with SgnCompare operator< becomes much less difficult:

struct Foo2 {

String a;

int b;

int c;

bool operator<(const Foo2& x) const {

int q = SgnCompare(a, x.a);

if(q) return q < 0;

q = SgnCompare(b, x.b);

if(q) return q < 0;

q = SgnCompare(c, x.c);

return q < 0;

}

};

Alternatively, it is possible to define just Compare method and use Comparable CRTP idiom to define all relation operators:

struct Foo3 : Comparable<Foo3> {

String a;

int b;

int c;

int Compare(const Foo3& x) const {

int q = SgnCompare(a, x.a);

if(q) return q;

q = SgnCompare(b, x.b);

if(q) return q;

return SgnCompare(c, x.c);

}

};

Foo3 m, n;

m.a = "A";

m.b = 1;

m.c = 2;

n.a = "A";

n.b = 1;

n.c = 3;

DUMP(m < n);

DUMP(m == n);

DUMP(m != n);

DUMP(SgnCompare(m, n));

m < n = true

m == n = false

m != n = true

SgnCompare(m, n) = -1

While the content of Compare method is trivial, it can be further simplified using CombineCompare helper class:

struct Foo4 : Comparable<Foo4> {

String a;

int b;

int c;

int Compare(const Foo4& x) const {

return CombineCompare(a, x.a)(b, x.b)(c, x.c);

}

};

Foo4 o, p;

o.a = "A";

o.b = 1;

o.c = 2;

p.a = "A";

p.b = 1;

p.c = 3;

DUMP(o < p);

DUMP(o == p);

DUMP(o != p);

DUMP(SgnCompare(o, p));

o < p = true

o == p = false

o != p = true

SgnCompare(o, p) = -1

2. Streams

2.1 Streams basics

U++ stream working with files is FileStream. It has 3 derived classes, FileIn, FileOut and FileAppend, for the most common uses.

FileIn in(GetDataFile("test.txt"));

if(!in) {

LOG("Failed to open the file");

return;

}

The most basic operations of streams are Put and Get. Get works in the same ways as good old C getc - it returns negative number on eof or error:

String h;

int c;

while((c = in.Get()) >= 0)

h.Cat(c);

DUMP(h);

h = Lorem ipsum dolor sit amet, consectetur adipiscing elit,

sed do eiusmod tempor incididunt ut labore et dolore magna

aliqua. Ut enim ad minim veniam, quis nostrud exercitation

ullamco laboris nisi ut aliquip ex ea commodo consequat.

Duis aute irure dolor in reprehenderit in voluptate velit

esse cillum dolore eu fugiat nulla pariatur. Excepteur

sint occaecat cupidatat non proident, sunt in culpa qui

officia deserunt mollit anim id est laborum.

U++ streams provide no formatting capabilities (that is deferred to text utilities), but they have some unique features. U++ does not distinguish between 'text' and 'binary' mode streams, methods are well suited to work with both in common mode.

GetLine returns String of single line read (lines separator being '\n', '\r' is ignored):

in.Seek(0);

while(!in.IsEof())

DUMP(in.GetLine());

in.GetLine() = Lorem ipsum dolor sit amet, consectetur adipiscing elit,

in.GetLine() = sed do eiusmod tempor incididunt ut labore et dolore magna

in.GetLine() = aliqua. Ut enim ad minim veniam, quis nostrud exercitation

in.GetLine() = ullamco laboris nisi ut aliquip ex ea commodo consequat.

in.GetLine() = Duis aute irure dolor in reprehenderit in voluptate velit

in.GetLine() = esse cillum dolore eu fugiat nulla pariatur. Excepteur

in.GetLine() = sint occaecat cupidatat non proident, sunt in culpa qui

in.GetLine() = officia deserunt mollit anim id est laborum.

Peek can be used to look at the next character without actually moving on to the next one:

in.Seek(0);

DDUMP((char)in.Peek());

DDUMP(in.GetLine());

(char)in.Peek() = L

in.GetLine() = Lorem ipsum dolor sit amet, consectetur adipiscing elit,

Get method reads at most specified number of bytes from the stream and returns them as String:

in.Seek(0);

DUMP(in.Get(10));

in.Get(10) = Lorem ipsu

If there is not enough characters in the Stream as required by Get, everything till EOF is returned:

in.Seek(0);

DUMP(in.Get(999999).GetCount());

in.Get(999999).GetCount() = 452

In contrast, GetAll method fails when there is not enough characters in the Stream and returns Void String if Stream is not in LoadThrowing mode:

in.Seek(0);

h = in.GetAll(100);

DUMP(h.GetCount());

h.GetCount() = 100

h = in.GetAll(999999);

DUMP(h.IsVoid());

h.IsVoid() = true

In LoadThrowing mode, Stream throws LoadingError exception when there is problem with input Stream:

in.LoadThrowing();

try {

in.GetAll(999999);

}

catch(LoadingError) {

LOG("Loading error");

}

Loading error

Template variant of Stream::operator<< is using AsString to convert data to text:

String fn = GetHomeDirFile("test.txt");

FileOut out(fn);

if(!out) {

LOG("Failed to open the file");

return;

}

out << "Some number " << 321 << " and Point " << Point(1, 2);

out.Close();

When writing to the Stream, the good way to check for errors is to write all data, close the stream and then check for IsError:

if(out.IsError()) { // check whether file was properly written

LOG("Error");

return;

}

DUMP(LoadFile(fn));

LoadFile(fn) = Some number 321 and Point [1, 2]

FileAppend can be used to append data to the file:

FileAppend out2(fn);

out2 << "\nSomething more";

out2.Close();

DUMP(LoadFile(fn));

LoadFile(fn) = Some number 321 and Point [1, 2]

Something more

Important and often used type of Stream is StringStream which works with String as input/output.

Stream also provides methods to store/load primitive types, in both little-endian and big-endian modes:

StringStream ss;

ss.Put32le(0x12345678);

ss.Put32be(0x12345678);

DUMPHEX(ss.GetResult());

ss.GetResult() = Memory at 0x0x7fff74a8efc0, size 0x8 = 8

+0 0x00007FFF74A8EFC0 78 56 34 12 12 34 56 78 xV4..4Vx

StringStream ss2(ss.GetResult());

DUMPHEX(ss2.Get32le());

DUMPHEX(ss2.Get32be());

ss2.Get32le() = 0x12345678

ss2.Get32be() = 0x12345678

2.2 Special streams

SizeStream counts the number of bytes written to the stream:

SizeStream szs;

szs << "1234567";

DUMP(szs.GetSize());

szs.GetSize() = 7

CompareStream can be used to compare the content of some stream with data written to CompareStream:

StringStream in("123456");

CompareStream cs(in);

cs.Put("12345");

DUMP(cs.IsEqual());

cs.IsEqual() = true

cs.Put("7");

DUMP(cs.IsEqual());

cs.IsEqual() = false

OutStream buffers output data to bigger blocks, then outputs them via Out virtual method:

struct MyOutStream : OutStream {

virtual void Out(const void *data, dword size) {

DUMPHEX(String((const char *)data, size));

}

};

MyOutStream os;

os << "This is a test " << 12345;

os.Close();

String((const char *)data, size) = Memory at 0x0x7f8910991d20, size 0x14 = 20

+0 0x00007F8910991D20 54 68 69 73 20 69 73 20 61 20 74 65 73 74 20 31 This is a test 1

+16 0x00007F8910991D30 32 33 34 35 2345

TeeStream sends output data to two separate streams:

StringStream ss1;

StringStream ss2;

TeeStream tee(ss1, ss2);

tee << "Tee stream test";

tee.Close();

DUMP(ss1.GetResult());

DUMP(ss2.GetResult());

ss1.GetResult() = Tee stream test

ss2.GetResult() = Tee stream test

MemReadStream can be used to convert read-only memory block to stream data:

static const char s[] = "Some line\nAnother line";

MemReadStream ms(s, sizeof(s) - 1);

while(!ms.IsEof())

DUMPHEX(ms.GetLine());

ms.GetLine() = Memory at 0x0x7fff74a8ec58, size 0x9 = 9

+0 0x00007FFF74A8EC58 53 6F 6D 65 20 6C 69 6E 65 Some line

ms.GetLine() = Memory at 0x0x7fff74a8ec58, size 0xC = 12

+0 0x00007FFF74A8EC58 41 6E 6F 74 68 65 72 20 6C 69 6E 65 Another line

2.3 Binary serialization

Serialization is a mechanism that converts structured data to/from binary stream. In U++, loading and storing of data is performed by single code, in most cases represented by method Serialize. Serialization is performed directly with basic Stream. To this end, Stream features a single boolean representing the direction of serialization process. The direction can be checked using IsLoading and IsStoring methods and changed with SetStoring and SetLoading methods. Direction is usually set properly by derived classes (e.g. FileOut sets it to storing, FileIn to loading).

Shortcut to calling Serialize method is operator%, which is templated overload that calls Serialize for given variable (primitive types have direct overload in Stream class):

StringStream ss;

int x = 123;

Color h = White();

ss % x % h;

StringStream ss2(ss.GetResult());

int x2;

Color h2;

ss2 % x2 % h2;

DUMP(x2);

DUMP(h2);

x2 = 123

h2 = Color(255, 255, 255)

When serialization fails to load the data (e.g. because of wrong structure or not enough data in the stream), Stream::LoadError is invoked, which can trigger the exception if the stream is LoadThrowing:

ss2.Seek(0);

ss2.LoadThrowing();

try {

ss2 % x2 % h2 % x2;

}

catch(LoadingError) {

LOG("Deserialization has failed");

}

Deserialization has failed

Examples so far serve mostly like basic demonstration of serialization. In practice, the implementation is usually represented by Serialize method of class that is to be compatible with this concept. To that end, it is a good idea to provide means for future expansion of such class:

struct MyFoo {

int number;

Color color;

void Serialize(Stream& s) {

int version = 0;

s / version; // allow backward compatibility in the future

s.Magic(31415); // put magic number into the stream to check for invalid data

s % number % color;

}

};

MyFoo foo;

foo.number = 321;

foo.color = Blue();

StoreAsFile, StoreAsString, LoadFromFile and LoadFromString are convenience functions that simplify storing / loading objects to / from the most common forms of storage:

String data = StoreAsString(foo);

MyFoo foo2;

LoadFromString(foo2, data);

DUMP(foo2.number);

DUMP(foo2.color);

foo2.number = 321

foo2.color = Color(0, 0, 128)

Now if MyFoo was to be extended to MyFoo2 and we wanted to maintain the ability to load it from binary data stored by original MyFoo, we can branch on previously stored version:

struct MyFoo2 {

int number;

Color color;

String text;

void Serialize(Stream& s) {

int version = 1;

s / version;

s % number % color;

if(version >= 1)

s % text;

}

};

MyFoo2 foo3;

LoadFromString(foo3, data);

DUMP(foo3.number);

DUMP(foo3.color);

foo3.number = 12582924

foo3.color = Color(Null)

Note: operator/ is Stream method with several overloads optimized for small value - in this case int is stored as single byte if possible (and as 5 bytes if not).

3. Array containers

3.1 Vector basics

Vector is the basic container of U++. It is the random access container similar to std::vector with one important performance related difference: There are rules for elements of Vector that allow its implementation to move elements in memory using plain memcpy/`memmove` ("Moveable" concept).

Anyway, for now let us start with simple Vector of ints:

Vector<int> v;

You can add elements to the Vector as parameters of the Add method

v.Add(1);

v.Add(2);

DUMP(v);

v = [1, 2]

Alternative and very important possibility for U++ containers is 'in-place creation'. In this case, parameter-less Add returns a reference to a new element in Vector:

v.Add() = 3;

DUMP(v);

v = [1, 2, 3]

You can also use operator<<

v << 4 << 5;

DUMP(v);

v = [1, 2, 3, 4, 5]

Vector also supports initializer lists:

v.Append({ 6, 7 });

DUMP(v);

v = [1, 2, 3, 4, 5, 6, 7]

To iterate Vector you can use indices:

for(int i = 0; i < v.GetCount(); i++)

LOG(v[i]);

begin/end interface:

for(auto q = v.begin(), e = v.end(); q != e; q++)

LOG(*q);

C++11 range-for syntax:

for(const auto& q : v)

LOG(q);

3.2 Vector operations

You can Insert or Remove elements at random positions of Vector (O(n) complexity):

Vector<int> v;

v.Add(1);

v.Add(2);

v.Insert(1, 10);

DUMP(v);

v = [1, 10, 2]

v.Insert(0, { 7, 6, 5 });

DUMP(v);

v = [7, 6, 5, 1, 10, 2]

v.Remove(0);

DUMP(v);

v = [6, 5, 1, 10, 2]

At method returns element at specified position ensuring that such a position exists. If there is not enough elements in Vector, required number of elements is added. If second parameter of At is present, newly added elements are initialized to this value.

v.Clear();

for(int i = 0; i < 10000; i++)

v.At(Random(10), 0)++;

DUMP(v);

v = [1002, 996, 1063, 1001, 969, 955, 1013, 985, 956, 1060]

Referencing invalid index is undefined operation. Sometimes however it is useful to return the element value if index is valid and some default value if it is not. This can be achieved with two parameter Get method:

DUMP(v.Get(4, 0));

DUMP(v.Get(-10, 0));

DUMP(v.Get(13, -1));

v.Get(4, 0) = 969

v.Get(-10, 0) = 0

v.Get(13, -1) = -1

You can apply algorithms on containers, e.g. Sort

Sort(v);

DUMP(v);

v = [955, 956, 969, 985, 996, 1001, 1002, 1013, 1060, 1063]

3.3 Transfer issues

Often you need to pass content of one container to another of the same type. U++ containers always support pick semantics (synonym of std::move), and, depending on type stored, also might support clone semantics. When transferring the value, you have to explicitly specify which one to use:

Vector<int> v{ 1, 2 };

DUMP(v);

Vector<int> v1 = pick(v);

DUMP(v);

DUMP(v1);

v = [1, 2]

v = []

v1 = [1, 2]

now source Vector v is empty, as elements were 'picked' to v1.

If you really need to preserve value of source (and elements support deep copy operation), you can use clone:

v = clone(v1);

DUMP(v);

DUMP(v1);

v = [1, 2]

v1 = [1, 2]

The requirement of explicit clone has the advantage of avoiding unexpected deep copies. For example:

Vector<Vector<int>> x;

x.Add() << 1 << 2 << 3;

for(auto i : x) { LOG(i); }

results in run-time error, whereas the equivalent code with std::vector compiles but silently performs deep copy for each iteration:

std::vector<std::vector<int>> sv;

sv.push_back({1, 2, 3});

for(auto i : sv) // invokes std::vector<int> copy constructor

for(auto j : i)

DUMP(j);

That said, in certain cases it is simpler to have default copy instead of explicit clone. You can easily achieve that using WithDeepCopy template:

WithDeepCopy<Vector<int>> v2;

v2 = v;

DUMP(v);

DUMP(v2);

v = [1, 2]

v2 = [1, 2]

3.4 Client types in U++ containers

So far we were using int as type of elements. In order to store client defined types into the Vector (and the Vector flavor) the type must satisfy moveable requirement - in short, it must not contain back-pointers nor virtual methods. Type must be marked as moveable in order to define interface contract using Moveable CRTP idiom:

struct Distribution : Moveable<Distribution> {

String text;

Vector<int> data;

String ToString() const { return text + ": " + AsString(data); }

};

Now to add Distribution elements you cannot use Vector::Add(const T&), because it requires elements to have default deep-copy constructor - and Distribution does not have one, as Vector<int>` has default pick-constructor, so Distribution itself has pick-constructor. It would no be a good idea either, because deep-copy would involve expensive copying of inner Vector.

Instead, Add without parameters has to be used - it default constructs (that is cheap) element in Vector and returns reference to it:

Vector<Distribution> dist;

for(int n = 5; n <= 10; n++) {

Distribution& d = dist.Add();

d.text << "Test " << n;

for(int i = 0; i < 10000; i++)

d.data.At(Random(n), 0)++;

}

DUMPC(dist);

dist:

[0] = Test 5: [2015, 1974, 1969, 2061, 1981]

[1] = Test 6: [1702, 1637, 1689, 1631, 1694, 1647]

[2] = Test 7: [1431, 1457, 1427, 1454, 1415, 1391, 1425]

[3] = Test 8: [1304, 1275, 1200, 1193, 1217, 1290, 1285, 1236]

[4] = Test 9: [1088, 1140, 1093, 1123, 1138, 1115, 1119, 1086, 1098]

[5] = Test 10: [994, 1068, 993, 977, 1009, 1018, 960, 961, 1048, 972]

Another possibility is to use Vector::Add(T&&) method, which uses pick-constructor instead of deep-copy constructor. E.g. Distribution elements might be generated by some function:

Distribution CreateDist(int n);

and code for adding such elements to Vector then looks like:

for(n = 5; n <= 10; n++)

dist.Add(CreateDist(n));

alternatively, you can use default-constructed variant too

dist.Add() = CreateDist();

3.5 Array flavor

If elements are not Moveable and therefore cannot be stored in Vector flavor, they can still be stored in Array flavor. Another reason for using Array is the need for referencing elements - Array flavor never invalidates references or pointers to them. Finally, if sizeof(T) is large (say more than 100-200 bytes), using Array might be better from performance perspective.

Example of elements that cannot be stored in Vector flavor are standard library objects like std::string (because obviously, standard library knows nothing about U++ Moveable concept):

Array<std::string> as;

for(int i = 0; i < 4; i++)

as.Add("Test");

for(auto s : as)

DUMP(s.c_str());

s.c_str() = Test

3.6 Polymorphic Array

Array can even be used for storing polymorphic elements:

struct Number {

virtual double Get() const = 0;

String ToString() const { return AsString(Get()); }

virtual ~Number() {}

};

struct Integer : public Number {

int n;

virtual double Get() const { return n; }

};

struct Double : public Number {

double n;

virtual double Get() const { return n; }

};

To add such derived types to Array, you can best use in-place creation with Create method:

Array<Number> num;

num.Create<Double>().n = 15.5;

num.Create<Integer>().n = 3;

DUMP(num);

num = [15.5, 3]

Alternatively, you can use Add(T *) method and provide a pointer to the newly created instance on the heap (Add returns a reference to the instance):

Double *nd = new Double;

nd->n = 1.1;

num.Add(nd);

DUMP(num);

num = [15.5, 3, 1.1]

Array takes ownership of heap object and deletes it as appropriate. We recommend to use this variant only if in-place creation with Create is not possible.

It is OK do directly apply U++ algorithms on Array (the most stringent requirement of any of basic algorithms is that there is IterSwap provided for container iterators and that is specialized for Array iterators):

Sort(num, [](const Number& a, const Number& b) { return a.Get() < b.Get(); });

DUMP(num);

num = [1.1, 3, 15.5]

3.7 Bidirectional containers

Vector and Array containers allow fast adding and removing elements at the end of sequence. Sometimes, same is needed at begin of sequence too (usually to support FIFO queues). BiVector and BiArray are optimal for this scenario:

BiVector<int> n;

n.AddHead(1);

n.AddTail(2);

n.AddHead(3);

n.AddTail(4);

DUMP(n);

n = [3, 1, 2, 4]

n.DropHead();

DUMP(n);

n = [1, 2, 4]

n.DropTail();

DUMP(n);

n = [1, 2]

struct Val {

virtual String ToString() const = 0;

virtual ~Val() {}

};

struct Number : Val {

int n;

virtual String ToString() const { return AsString(n); }

};

struct Text : Val {

String s;

virtual String ToString() const { return s; }

};

BiArray<Val> num;

num.CreateHead<Number>().n = 3;

num.CreateTail<Text>().s = "Hello";

num.CreateHead<Text>().s = "World";

num.CreateTail<Number>().n = 2;

DUMP(num);

num = [World, 3, Hello, 2]

3.8 Index

Index is the the foundation of all U++ associative operations and is one of defining features of U++.

Index is a container very similar to the plain Vector (it is random access array of elements with fast addition at the end) with one additional feature - it is able to fast retrieve position of element with required value using Find method:

Index<String> ndx;

ndx.Add("alfa");

ndx.Add("beta");

ndx.Add("gamma");

ndx.Add("delta");

ndx.Add("kappa");

DUMP(ndx);

DUMP(ndx.Find("beta"));

ndx = [alfa, beta, gamma, delta, kappa]

ndx.Find("beta") = 1

If element is not present in Index, Find returns a negative value:

DUMP(ndx.Find("something"));

ndx.Find("something") = -1

Any element can be replaced using Set method:

ndx.Set(1, "alfa");

DUMP(ndx);

ndx = [alfa, alfa, gamma, delta, kappa]

If there are more elements with the same value, they can be iterated using FindNext method:

int fi = ndx.Find("alfa");

while(fi >= 0) {

DUMP(fi);

fi = ndx.FindNext(fi);

}

fi = 0

fi = 1

FindAdd method retrieves position of element like Find, but if element is not present in Index, it is added:

DUMP(ndx.FindAdd("one"));

DUMP(ndx.FindAdd("two"));

DUMP(ndx.FindAdd("three"));

DUMP(ndx.FindAdd("two"));

DUMP(ndx.FindAdd("three"));

DUMP(ndx.FindAdd("one"));

ndx.FindAdd("one") = 5

ndx.FindAdd("two") = 6

ndx.FindAdd("three") = 7

ndx.FindAdd("two") = 6

ndx.FindAdd("three") = 7

ndx.FindAdd("one") = 5

Removing elements from random access sequence tends to be expensive, that is why rather than remove, Index supports Unlink and UnlinkKey operations, which retain the element in Index but make it invisible for Find operation:

ndx.Unlink(2);

ndx.UnlinkKey("kappa");

DUMP(ndx.Find(ndx[2]));

DUMP(ndx.Find("kappa"));

ndx.Find(ndx[2]) = -1

ndx.Find("kappa") = -1

You can test whether element at given position is unlinked using IsUnlinked method

DUMP(ndx.IsUnlinked(1));

DUMP(ndx.IsUnlinked(2));

ndx.IsUnlinked(1) = false

ndx.IsUnlinked(2) = true

Unlinked positions can be reused by Put method:

ndx.Put("foo");

DUMP(ndx);

DUMP(ndx.Find("foo"));

ndx = [alfa, alfa, gamma, delta, foo, one, two, three]

ndx.Find("foo") = 4

You can also remove all unlinked elements from Index using Sweep method:

ndx.Sweep();

DUMP(ndx);

ndx = [alfa, alfa, delta, foo, one, two, three]

Operations directly removing or inserting elements of Index are expensive, but available too:

ndx.Remove(1);

DUMP(ndx);

ndx = [alfa, delta, foo, one, two, three]

ndx.RemoveKey("two");

DUMP(ndx);

ndx = [alfa, delta, foo, one, three]

ndx.Insert(0, "insert");

DUMP(ndx);

ndx = [insert, alfa, delta, foo, one, three]

PickKeys operation allows you to obtain Vector of elements of Index in low constant time operation (while destroying source Index)

Vector<String> d = ndx.PickKeys();

DUMP(d);

d = [insert, alfa, delta, foo, one, three]

Pick-assigning Vector to Index is supported as well:

d[0] = "test";

ndx = pick(d);

DUMP(ndx);

ndx = [test, alfa, delta, foo, one, three]

3.9 Index and client types

In order to store elements to Index, they type must be Moveable, have deep copy and defined the operator== and a GetHashValue function or method to compute the hash code. It is recommended to use CombineHash to combine hash values of types that already provide GetHashValue:

struct Person : Moveable<Person> {

String name;

String surname;

unsigned GetHashValue() const { return CombineHash(name, surname); }

bool operator==(const Person& b) const { return name == b.name && surname == b.surname; }

Person(String name, String surname) : name(name), surname(surname) {}

Person() {}

};

Index<Person> p;

p.Add(Person("John", "Smith"));

p.Add(Person("Paul", "Carpenter"));

p.Add(Person("Carl", "Engles"));

DUMP(p.Find(Person("Paul", "Carpenter")));

p.Find(Person("Paul", "Carpenter")) = 1

3.10 VectorMap, ArrayMap

VectorMap is nothing else than a simple composition of Index of keys and Vector of values. You can use Add methods to put elements into the VectorMap:

struct Person : Moveable<Person> {

String name;

String surname;

String ToString() const { return String() << name << ' ' << surname; }

Person(String name, String surname) : name(name), surname(surname) {}

Person() {}

};

VectorMap<String, Person> m;

m.Add("1", Person("John", "Smith"));

m.Add("2", Person("Carl", "Engles"));

Person& p = m.Add("3");

p.name = "Paul";

p.surname = "Carpenter";

DUMP(m);

m = {1: John Smith, 2: Carl Engles, 3: Paul Carpenter}

VectorMap provides read-only access to its Index of keys and read-write access to its Vector of values:

DUMP(m.GetKeys());

DUMP(m.GetValues());

m.GetKeys() = [1, 2, 3]

m.GetValues() = [John Smith, Carl Engles, Paul Carpenter]

m.GetValues()[2].name = "Peter";

DUMP(m);

m = {1: John Smith, 2: Carl Engles, 3: Peter Carpenter}

You can use indices to iterate VectorMap contents:

for(int i = 0; i < m.GetCount(); i++)

LOG(m.GetKey(i) << ": " << m[i]);

1: John Smith

2: Carl Engles

3: Peter Carpenter

Standard begin / end pair for VectorMap is the range of just values (internal Vector) - it corresponds with operator[] returning values:

for(const auto& p : m)

DUMP(p);

p = John Smith

p = Carl Engles

p = Peter Carpenter

To iterate through keys, you can use begin/`end` of internal Index:

for(const auto& p : m.GetKeys())

DUMP(p);

p = 1

p = 2

p = 3

Alternatively, it is possible to create 'projection range' of VectorMap that provides convenient key/value iteration, using operator~ (note that is also removes 'unlinked' items, see later):

for(const auto& e : ~m) {

DUMP(e.key);

DUMP(e.value);

}

e.key = 1

e.value = John Smith

e.key = 2

e.value = Carl Engles

e.key = 3

e.value = Peter Carpenter

Note that the 'projection range' obtained by operator~ is temporary value, which means that if mutating operation is required for values, r-value reference has to be used instead of plain reference:

for(const auto& e : ~m)

if(e.key == "2")

e.value.surname = "May";

DUMP(m);

m = {1: John Smith, 2: Carl May, 3: Peter Carpenter}

You can use Find method to retrieve position of element with required key:

DUMP(m.Find("2"));

m.Find("2") = 1

or Get method to retrieve corresponding value:

DUMP(m.Get("2"));

m.Get("2") = Carl May

Passing key not present in VectorMap as Get parameter is undefined behavior (ASSERT fails in debug mode), but there exists two parameter version of Get that returns second parameter if the key is not found in VectorMap:

DUMP(m.Get("33", Person("unknown", "person")));

m.Get("33", Person("unknown", "person")) = unknown person

As with Index, you can use Unlink to make elements invisible for Find operations:

m.Unlink(1);

DUMP(m.Find("2"));

m.Find("2") = -1

SetKey changes the key of the element:

m.SetKey(1, "33");

DUMP(m.Get("33", Person("unknown", "person")));

m.Get("33", Person("unknown", "person")) = Carl May

If there are more elements with the same key in VectorMap, you can iterate them using FindNext method:

m.Add("33", Person("Peter", "Pan"));

int q = m.Find("33");

while(q >= 0) {

DUMP(m[q]);

q = m.FindNext(q);

}

m[q] = Carl May

m[q] = Peter Pan

Unlinked positions can be 'reused' using Put method:

m.UnlinkKey("33");

m.Put("22", Person("Ali", "Baba"));

m.Put("44", Person("Ivan", "Wilks"));

DUMP(m);

m = {1: John Smith, 44: Ivan Wilks, 3: Peter Carpenter, 22: Ali Baba}

PickValues / PickIndex / PickKeys / pick internal Vector / Index / Vector of Index:

Vector<Person> ps = m.PickValues();

Vector<String> ks = m.PickKeys();

DUMP(ps);

DUMP(ks);

DUMP(m);

ps = [John Smith, Ivan Wilks, Peter Carpenter, Ali Baba]

ks = [1, 44, 3, 22]

m = {}

VectorMap pick constructor to create map by picking:

ks[0] = "Changed key";

m = VectorMap<String, Person>(pick(ks), pick(ps));

DUMP(m);

m = {Changed key: John Smith, 44: Ivan Wilks, 3: Peter Carpenter, 22: Ali Baba}

ArrayMap is composition of Index and Array, for cases where Array is better fit for value type (e.g. they are polymorphic):

ArrayMap<String, Person> am;

am.Create<Person>("key", "new", "person");

DUMP(am);

am = {key: new person}

3.11 One

One is a container that can store none or one element of T or derived from T. It is functionally quite similar to std::unique_ptr, but has some convenient features.

struct Base {

virtual String Get() = 0;

virtual ~Base() {}

};

struct Derived1 : Base {

virtual String Get() { return "Derived1"; }

};

struct Derived2 : Base {

virtual String Get() { return "Derived2"; }

};

One<Base> s;

operator bool of one returns true if it contains an element:

DUMP((bool)s);

(bool)s = false

s.Create<Derived1>();

DUMP((bool)s);

DUMP(s->Get());

(bool)s = true

s->Get() = Derived1

You can use Is to check if certain type is currently stored in One:

DUMP(s.Is<Derived1>());

DUMP(s.Is<Base>());

DUMP(s.Is<Derived2>());

s.Is<Derived1>() = true

s.Is<Base>() = true

s.Is<Derived2>() = false

To get a pointer to the contained instance, use operator~:

Base *b = ~s;

DUMP(b->Get());

b->Get() = Derived1

Clear method removes the element from One:

s.Clear();

DUMP((bool)s);

(bool)s = false

Helper function MakeOne derived from One can be used to create contained element:

s = MakeOne<Derived1>();

DUMP(s->Get());

s->Get() = Derived1

auto t = pick(s);

DUMP(t->Get());

t->Get() = Derived1

3.12 Any

Any is a container that can contain none or one element of any type. Any::Is method matches exact type ignoring class hierarchies (unlike One::Is). You can use Get to retrieve a reference to the instance stored:

for(int pass = 0; pass < 2; pass++) {

Any x;

if(pass)

x.Create<String>() = "Hello!";

else

x.Create<Color>() = Blue();

if(x.Is<String>())

LOG("Any is now String: " << x.Get<String>());

if(x.Is<Color>())

LOG("Any is now Color: " << x.Get<Color>());

}

Any is now Color: Color(0, 0, 128)

Any is now String: Hello!

3.13 InVector, InArray

InVector and InArray are container types quite similar to Vector/`Array`, but they trade the speed of operator[] with the ability to insert or remove elements at any position quickly. You can expect operator[] to be about 10 times slower than in Vector (but that is still quite fast), while Insert at any position scales well up to hundreds of megabytes of data (e.g. InVector containing 100M of String elements is handled without problems).

InVector<int> v;

for(int i = 0; i < 1000000; i++)

v.Add(i);

v.Insert(0, -1); // This is fast

While the interface of InVector/`InArray` is almost identical to Vector/`Array`, InVector/`InArray` in addition implements FindLowerBound/`FindUpperBound` methods - while normal generic range algorithms work, it is possible to provide InVector/`InArray` specific optimizations that basically match the performace of Find*Bound on simple Vector.

DUMP(v.FindLowerBound(55));

v.FindLowerBound(55) = 56

3.14 SortedIndex, SortedVectorMap, SortedArrayMap

SortedIndex is similar to regular Index, but keeps its elements in sorted order (sorting predicate is a template parameter, defaults to StdLess). Implementation is using InVector, so it works fine even with very large number of elements (performance is similar to tree based std::set). Unlike Index, SortedIndex provides lower/upper bounds searches, so it allows range search.

SortedIndex<int> x;

x.Add(5);

x.Add(3);

x.Add(7);

x.Add(1);

DUMPC(x);

DUMP(x.Find(3));

DUMP(x.FindLowerBound(3));

DUMP(x.FindUpperBound(6));

[0] = 1

[1] = 3

[2] = 5

[3] = 7

x.Find(3) = 1

x.FindLowerBound(3) = 1

x.FindUpperBound(6) = 3

SortedVectorMap and SortedArrayMap are then SortedIndex based equivalents to VectorMap/`ArrayMap`:

SortedVectorMap<String, int> m;

m.Add("zulu", 11);

m.Add("frank", 12);

m.Add("alfa", 13);

DUMPM(m);

DUMP(m.Get("zulu"));

[0] = (alfa) 13

[1] = (frank) 12

[2] = (zulu) 11

m.Get("zulu") = 11

3.15 Tuples

Template class Tuple allows combining 2-4 values with different types. These are principally similar to std::tuple, with some advantages. Unlike std::tuple, individual elements are directly accessible as member variables a..`d`, Tuple supports persistent storage patterns (Serialize, Jsonize, Xmlize), hash code (GetHashValue), conversion to String and Value conversions.

To create a Tuple value, you can use the MakeTuple function.

Tuple<int, String, String> x = MakeTuple(12, "hello", "world");

Individual values are accessible as members a .. d:

DUMP(x.a);

DUMP(x.b);

DUMP(x.c);

x.a = 12

x.b = hello

x.c = world

Or using Get:

DUMP(x.Get<1>());

DUMP(x.Get<int>());

x.Get<1>() = hello

x.Get<int>() = 12

As long as all individual types have conversion to String (AsString), the tuple also has such conversion and thus can e.g. be easily logged:

DUMP(x);

x = (12, hello, world)

As long as individual types have defined GetHashValue, so does Tuple:

DUMP(GetHashValue(x));

GetHashValue(x) = 2387239184

As long as individual types have defined operator==, Tuple has defined operator== and operator!=:

Tuple<int, String, String> y = x;

DUMP(x == y);

DUMP(x != y);

y.a++;

DUMP(x == y);

DUMP(x != y);

x == y = true

x != y = false

x == y = false

x != y = true

As long as all individual types have defined SgnCompare, Tuple has SgnCompare, Compare method and operators <, <=, >, >=:

DUMP(x.Compare(y));

DUMP(SgnCompare(x, y));

DUMP(x < y);

x.Compare(y) = -1

SgnCompare(x, y) = -1

x < y = true

GetCount returns the width of Tuple:

DUMP(x.GetCount());

x.GetCount() = 3

Elements that are directly convertible with Value can be 'Get'/'Set':

for(int i = 0; i < x.GetCount(); i++)

DUMP(x.Get(i));

x.Get(i) = 12

x.Get(i) = hello

x.Get(i) = world

x.Set(1, "Hi");

DUMP(x);

x = (12, Hi, world)

As long as all individual types are convertible with Value, you can convert Tuple to ValueArray and back:

ValueArray va = x.GetArray();

DUMP(va);

va.Set(2, "Joe");

x.SetArray(va);

va = [12, Hi, world]

It is OK to assign Tuple to Tuple with different individual types, as long as types are directly convertible:

Tuple<double, String, String> d = x;

DUMP(d);

d = (12, Hi, Joe)

Tie can be used to assign tuple to l-values:

int i;

String s1, s2;

Tie(i, s1, s2) = x;

DUMP(i);

DUMP(s1);

DUMP(s2);

i = 12

s1 = Hi

s2 = Joe

U++ Tuples are carefully designed as POD type, which allows POD arrays to be intialized with classic C style:

static Tuple2<int, const char *> map[] = {

{ 1, "one" },

{ 2, "one" },

{ 3, "one" },

};

Simple FindTuple template function is provided to search for tuple based on the first value (a) (linear O(n) search):

DUMP(FindTuple(map, __countof(map), 3)->b);

FindTuple(map, __countof(map), 3)->b = one

4. Ranges and algorithms

4.1 Range

Unlike STL, which interface algorithms with data using begin / end pair, U++ algorithms usually work on Ranges. Range is an object that has begin / end methods providing random access to elements (all U++ containers are random access), operator[] and GetCount method.

Obviously, U++ containers are ranges:

Vector<int> x = { 1, 2, 3, 4, 5, 1, 2, 3, 4 };

DUMP(FindIndex(x, 2)); // FindIndex is a trivial algorithm that does linear search

FindIndex(x, 2) = 1

If you want the algorithm to run on part of container only, you can use SubRange instance:

DUMP(SubRange(x, 3, 6));

DUMP(FindIndex(SubRange(x, 3, 6), 4));

SubRange(x, 3, 6) = [4, 5, 1, 2, 3, 4]

FindIndex(SubRange(x, 3, 6), 4) = 0

As a side-job, SubRange can also be created from 'begin' / 'end' pair, thus e.g. allowing algorithms to work on C arrays:

int a[] = { 1, 22, 4, 2, 8 };

auto ar = SubRange(std::begin(a), std::end(a));

DUMP(ar);

ar = [1, 22, 4, 2, 8]

Sort(ar);

DUMP(ar);

ar = [1, 2, 4, 8, 22]

There are some macro aliases that make type management of ranges easier:

DUMP(typeid(ValueTypeOf<decltype(x)>).name());

DUMP(typeid(ValueTypeOf<decltype(SubRange(x, 1, 1))>).name());

DUMP(typeid(IteratorOf<decltype(x)>).name());

DUMP(typeid(ConstIteratorOf<decltype(SubRange(x, 1, 1))>).name());

DUMP(typeid(SubRangeOf<Vector<int>>).name());

typeid(ValueTypeOf<decltype(x)>).name() = i

typeid(ValueTypeOf<decltype(SubRange(x, 1, 1))>).name() = i

typeid(IteratorOf<decltype(x)>).name() = Pi

typeid(ConstIteratorOf<decltype(SubRange(x, 1, 1))>).name() = Pi

typeid(SubRangeOf<Vector<int>>).name() = N3Upp13SubRangeClassIPiEE

While containers themselves and SubRange are the two most common range types, U++ has two special ranges. ConstRange simply provides the range of single value:

DUMP(ConstRange(1, 10));

ConstRange(1, 10) = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

ReverseRange reverses the order of elements in the source range:

Vector<int> v{ 1, 2, 3, 4 };

DUMP(ReverseRange(v));

ReverseRange(v) = [4, 3, 2, 1]

ViewRange picks a source range and Vector of integer indices a provides a view of source range through this Vector:

Vector<int> h{ 2, 4, 0 };

DUMP(ViewRange(x, clone(h)));

ViewRange(x, clone(h)) = [3, 5, 1]

Sort(ViewRange(x, clone(h)));

DUMP(ViewRange(x, clone(h)));

DUMP(x);

ViewRange(x, clone(h)) = [1, 3, 5]

x = [5, 2, 1, 4, 3, 1, 2, 3, 4]

SortedRange returns range sorted by predicate (default is std::less):

DUMP(SortedRange(x));

SortedRange(x) = [1, 1, 2, 2, 3, 3, 4, 4, 5]

Finally FilterRange creates a subrange of elements satisfying certain condition:

DUMP(FilterRange(x, [](int x) { return x > 3; }));

FilterRange(x, [](int x) { return x > 3; }) = [5, 4, 4]

Various Range functions can be combined to produce complex results:

DUMP(ReverseRange(FilterRange(x, [](int x) { return x < 4; })));

ReverseRange(FilterRange(x, [](int x) { return x < 4; })) = [3, 2, 1, 3, 1, 2]

4.2 Algorithms

In principle, is is possible to apply C++ standard library algorithms on U++ containers or ranges.

U++ algorithms are tuned for U++ approach - they work on ranges and they prefer indices. Sometimes, U++ algorithm will perform faster with U++ types than standard library algorithm.

FindIndex performs linear search to find element with given value and returns its index or -1 if not found:

Vector<int> data { 5, 3, 7, 9, 3, 4, 2 };

DUMP(FindIndex(data, 3));

DUMP(FindIndex(data, 6));

FindIndex(data, 3) = 1

FindIndex(data, 6) = -1

SubRange can be used to apply algorithm on subrange of container:

DUMP(FindIndex(SubRange(data, 2, data.GetCount() - 2), 3));

FindIndex(SubRange(data, 2, data.GetCount() - 2), 3) = 2

FindMin and FindMax return the index of minimal / maximal element:

DUMP(FindMin(data));

DUMP(FindMax(data));

FindMin(data) = 6

FindMax(data) = 3

Min and Max return the value of minimal / maximal element:

DUMP(Min(data));

DUMP(Max(data));

Min(data) = 2

Max(data) = 9

If the range is empty, Min and Max are undefined (ASSERT fails in debug mode), unless the value is specified as second parameter to be used in this case:

Vector<int> empty;

// DUMP(Min(empty)); // This is undefined (fails in ASSERT)

DUMP(Min(empty, -99999));

Min(empty, -99999) = -99999

Count returns the number of elements with specified value, CountIf the number of elements that satisfy predicate:

DUMP(Count(data, 11));

DUMP(CountIf(data, [=](int c) { return c >= 5; }));

Count(data, 11) = 0

CountIf(data, [=](int c) { return c >= 5; }) = 3

Sum return the sum of all elements in range:

DUMP(Sum(data));

Sum(data) = 33

Sorted containers can be searched with bisection. U++ provides usual upper / lower bound algorithms. FindBinary returns the index of element with given value or -1 if not found:

data = { 5, 7, 9, 9, 14, 20, 23, 50 };

// 0 1 2 3 4 5 6 7

DUMP(FindLowerBound(data, 9));

DUMP(FindUpperBound(data, 9));

DUMP(FindBinary(data, 9));

DUMP(FindLowerBound(data, 10));

DUMP(FindUpperBound(data, 10));

DUMP(FindBinary(data, 10));

FindLowerBound(data, 9) = 2

FindUpperBound(data, 9) = 4

FindBinary(data, 9) = 2

FindLowerBound(data, 10) = 4

FindUpperBound(data, 10) = 4

FindBinary(data, 10) = -1

4.3 Sorting

Unsurprisingly, Sort function sorts a range. You can specify sorting predicate, default is operator<:

Vector<String> x { "1", "2", "10" };

Sort(x);

DUMP(x);

x = [1, 10, 2]

Sort(x, [](const String& a, const String& b) { return atoi(a) < atoi(b); });

DUMP(x);

x = [1, 2, 10]

IndexSort is sort variant that is able to sort two ranges (like Vector or Array) of the same size, based on values in the first range:

Vector<int> a { 5, 10, 2, 9, 7, 3 };

Vector<String> b { "five", "ten", "two", "nine", "seven", "three" };

IndexSort(a, b);

DUMP(a);

DUMP(b);

a = [2, 3, 5, 7, 9, 10]

b = [two, three, five, seven, nine, ten]

IndexSort(b, a);

DUMP(a);

DUMP(b);

a = [5, 9, 7, 10, 3, 2]

b = [five, nine, seven, ten, three, two]

There are also IndexSort2 and IndexSort3 variants that sort 2 or 3 dependent ranges.

Sometimes, instead of sorting items in the range, it is useful to know the order of items as sorted, using GetSortOrder:

Vector<int> o = GetSortOrder(a);

DUMP(o);

o = [5, 4, 0, 2, 1, 3]

Normal Sort is not stable - equal items can appear in sorted range in random order. If maintaining original order of equal items is important, use StableSort variant (with performance penalty):

Vector<Point> t { Point(10, 10), Point(7, 1), Point(7, 2), Point(7, 3), Point(1, 0) };

StableSort(t, [](const Point& a, const Point& b) { return a.x < b.x; });

DUMP(t);

t = [[1, 0], [7, 1], [7, 2], [7, 3], [10, 10]]

All sorting algorithms have they 'Stable' variant, so there is StableIndexSort, GetStableSortOrder etc...

5. Value

5.1 Value

Value is sort of equivalent of polymorphic data types from scripting languages like Python or JavaSript. Value can represent values of concrete types, some types also have extended interoperability with Value and it is then possible to e.g. compare Values containing such types against each other or serialize them for persistent storage.

Usually, Value compatible types define typecast operator to Value and constructor from Value, so that interaction is for the most part seamless:

Value a = 1;

Value b = 2.34;

Value c = GetSysDate();

Value d = "hello";

DUMP(a);

DUMP(b);

DUMP(c);

DUMP(d);

int x = a;

double y = b;

Date z = c;

String s = d;

DUMP(x);

DUMP(y);

DUMP(z);

DUMP(s);

a = 1

b = 2.34

c = 09/08/2019

d = hello

x = 1

y = 2.34

z = 09/08/2019

s = hello

As for primitive types, Value seamlessly works with int, int64, bool and double. Casting Value to a type that it does not contain throws an exception:

try {

s = a;

DUMP(s); // we never get here....

}

catch(ValueTypeError) {

LOG("Failed Value conversion");

}

Failed Value conversion

However, conversion between related types is possible (as long as it is supported by these types):

double i = a;

int j = b;

Time k = c;

WString t = d;

DUMP(i);

DUMP(j);

DUMP(k);

DUMP(t);

i = 1

j = 2

k = 09/08/2019 00:00:00

t = hello

To determine type of value stored in Value, you can use Is method:

DUMP(a.Is<int>());

DUMP(a.Is<double>());

DUMP(b.Is<double>());

DUMP(c.Is<int>());

DUMP(c.Is<Date>());

DUMP(d.Is<String>());

a.Is<int>() = true

a.Is<double>() = false

b.Is<double>() = true

c.Is<int>() = false

c.Is<Date>() = true

d.Is<String>() = true

Note that Is tests for absolute type match, not for compatible types. For that reason, for widely used compatible types helper functions are defined:

DUMP(IsNumber(a));

DUMP(IsNumber(b));

DUMP(IsDateTime(c));

DUMP(IsString(d));

IsNumber(a) = true

IsNumber(b) = true

IsDateTime(c) = true

IsString(d) = true

5.2 Null

U++ defines a special Null constant to represent an empty value. This constant is convertible to many value types including primitive types double, int and int64 (defined as lowest number the type can represent). If type supports ordering (<, >), all values of the type are greater than Null value. To test whether a value is empty, use IsNull function.

int x = Null;

int y = 120;

Date d = Null;

Date e = GetSysDate();

DUMP(x);

DUMP(y);

DUMP(d);

DUMP(e > d);

x =

y = 120

d =

e > d = true

Null is the only instance of Nuller type. Assigning Null to primitive types is achieved by cast operators of Nuller, other types can do it using constructor from Nuller.

As a special case, if Value contains Null, it is convertible to any value type that can contain Null:

Value v = x; // x is int

e = v; // e is Date, but v is Null, so Null is assigned to e

DUMP(IsNull(e));

IsNull(e) = true

Function Nvl is U++ analog of well known SQL function coalesce (ifnull, Nvl), which returns the first non-null argument (or Null if all are Null).

int a = Null;

int b = 123;

int c = 1;

DUMP(Nvl(a, b, c));

Nvl(a, b, c) = 123

5.3 Client types and Value, RawValue, RichValue

There are two Value compatibility levels. The simple one, RawValue, has little requirements for the type used - only copy constructor and assignment operator are required (and there are even forms of RawValue that work for types missing these):

struct RawFoo {

String x;

// default copy constructor and assignment operator are provided by compiler

};

To convert such type to Value, use RawToValue:

RawFoo h;

h.x = "hello";

Value q = RawToValue(h);

DUMP(q.Is<RawFoo>());

q.Is<RawFoo>() = true

To convert it back, us 'To' templated member function of Value, it returns a constant reference to the value:

DUMP(q.To<RawFoo>().x);

q.To<RawFoo>().x = hello

RichValue level Values provide more operations for Value - equality test, IsNull test, hashing, conversion to text, serialization (possibly to XML and Json), comparison. In order to make serialization work, type must also have assigned an integer id (client types should use ids in range 10000..20000). Type can provide the support for these operations via template function specializations or (perhaps more convenient) using defined methods and inheriting from ValueType base class template:

struct Foo : ValueType<Foo, 10010> {

int x;

Foo(const Nuller&) { x = Null; }

Foo(int x) : x(x) {}

Foo() {}

// We provide these methods to allow automatic conversion of Foo to/from Value

operator Value() const { return RichToValue(*this); }

Foo(const Value& v) { *this = v.Get<Foo>(); }

String ToString() const { return AsString(x); }

unsigned GetHashValue() const { return x; }

void Serialize(Stream& s) { s % x; }

bool operator==(const Foo& b) const { return x == b.x; }

bool IsNullInstance() const { return IsNull(x); }

int Compare(const Foo& b) const { return SgnCompare(x, b.x); }

// This type does not define XML nor Json serialization

};

INITBLOCK { // This has to be at file level scope

Value::Register<Foo>(); // need to register value type integer id to allow serialization

}

Value a = Foo(54321); // uses Foo::operator Value

Value b = Foo(54321);

Value c = Foo(600);

DUMP(a); // uses Foo::ToString

DUMP(a == b); // uses Foo::operator==

DUMP(a == c);

DUMP(c < a); // uses Foo::Compare

DUMP(IsNull(a)); // uses Foo::IsNullInstance

Foo foo = c; // Uses Foo::Foo(const Value&)

DUMP(foo);

a = 54321

a == b = true

a == c = false

c < a = true

IsNull(a) = false

foo = 600

String s = StoreAsString(a); // Uses Foo::Serialize

Value loaded;

// Using registered (Value::Registered) integer id creates the correct type, then uses

// Foo::Serialize to load the data from the stream

LoadFromString(loaded, s);

DUMP(loaded);

loaded = 54321

5.4 ValueArray and ValueMap

ValueArray is a type that represents an array of Values:

ValueArray va{1, 2, 3};

DUMP(va);

va = [1, 2, 3]

ValueArray can be assigned to Value (and back):

Value v = va;

DUMP(v);

DUMP(v.Is<ValueArray>()); // must be exactly ValueArray

DUMP(IsValueArray(v)); // is ValueArray or ValueMap (which is convertible to ValueArray)

ValueArray va2 = v;

DUMP(va2);

v = [1, 2, 3]

v.Is<ValueArray>() = true

IsValueArray(v) = true

va2 = [1, 2, 3]

Elements can be appended using Add method or operator<<, element at index can be changed with Set:

va.Add(10);

va << 20 << 21;

va.Set(0, 999);

DUMP(va);

va = [999, 2, 3, 10, 20, 21]

Elements can be removed:

va.Remove(0, 2);

DUMP(va);

va = [3, 10, 20, 21]

and inserted:

va.Insert(1, v);

DUMP(va);

va = [3, 1, 2, 3, 10, 20, 21]

It is possible to get a reference to element at index, however note that some special rules apply here:

va.At(0) = 222;

DUMP(va);

va = [222, 1, 2, 3, 10, 20, 21]

If Value contains ValueArray, Value::GetCount method returns the number of elements in the array (if there is no ValueArray in Value, it returns zero). You can use Value::operator[](int) to get constant reference to ValueArray elements:

for(int i = 0; i < v.GetCount(); i++)

LOG(v[i]);

It is even possible to directly add element to Value if it contains ValueArray:

v.Add(4);

DUMP(v);

v = [1, 2, 3, 4]

Or even get a reference to element at some index (with special rules):

v.At(0) = 111;

DUMP(v);

v = [111, 2, 3, 4]

ValueMap can store key - value pairs and retrieve value for key quickly. Note that keys are not limited to String, but can be any Value with operator== and hash code defined.

Add method or operator() add data to ValueMap:

ValueMap m;

m.Add("one", 1);

m("two", 2)("three", 3);

DUMP(m);

m = { one: 1, two: 2, three: 3 }

operator[] retrieves the value at the key:

DUMP(m["two"]);

m["two"] = 2

When key is not present in the map, operator[] returns void Value (which is also Null):

DUMP(m["key"]);

DUMP(m["key"].IsVoid());

DUMP(IsNull(m["key"]));

m["key"] =

m["key"].IsVoid() = true

IsNull(m["key"]) = true

Just like VectorMap, ValueMap is ordered, so the order of adding pairs to it matters:

ValueMap m2;

m2.Add("two", 2);

m2("one", 1)("three", 3);

DUMP(m2);

DUMP(m == m2); // different order of adding means they are not equal

m2 = { two: 2, one: 1, three: 3 }

m == m2 = false

'Unordered' equality test can be done using IsSame:

DUMP(m.IsSame(m2));

m.IsSame(m2) = true

Iterating ValueMap can be achieved with GetCount, GetKey and GetValue:

for(int i = 0; i < m.GetCount(); i++)

LOG(m.GetKey(i) << " = " << m.GetValue(i));

one = 1

two = 2

three = 3

It is possible to get ValueArray of values:

LOG(m.GetValues());

[1, 2, 3]

GetKeys gets constant reference to Index<Value> of keys:

LOG(m.GetKeys());

[one, two, three]

It is possible to change the value with Set:

m.Set("two", 4);

DUMP(m);

m = { one: 1, two: 4, three: 3 }

Or to change the value of key with SetKey:

m.SetKey(1, "four");

DUMP(m);

m = { one: 1, four: 4, three: 3 }

It is possible get a reference of value at given key, (with special rules) with GetAdd or operator():

Value& h = m("five");

h = 5;

DUMP(m);

m = { one: 1, four: 4, three: 3, five: 5 }

When ValueMap is stored into Value, operator[](String) provides access to value at key. Note that this narrows keys to text values:

v = m;

DUMP(v);

DUMP(v["five"]);

v = { one: 1, four: 4, three: 3, five: 5 }

v["five"] = 5

Value::GetAdd and Value::operator() provide a reference to value at key, with special rules:

v.GetAdd("newkey") = "foo";

v("five") = "FIVE";

DUMP(v);

v = { one: 1, four: 4, three: 3, five: FIVE, newkey: foo }

ValueMap and ValueArray are convertible with each other. When assigning ValueMap to ValueArray, values are simply used:

ValueArray v2 = m;

DUMP(v2);

v2 = [1, 4, 3, 5]

When assigning ValueArray to ValueMap, keys are set as indices of elements:

ValueMap m3 = v2;

DUMP(m3);

m3 = { 0: 1, 1: 4, 2: 3, 3: 5 }

With basic Value types int, String, ValueArray and ValueMap, Value can represent JSON:

Value j = ParseJSON("{ \"array\" : [ 1, 2, 3 ] }");

DUMP(j);

j = { array: [1, 2, 3] }

j("value") = m;

DUMP(AsJSON(j));

AsJSON(j) = {"array":[1,2,3],"value":{"one":1,"four":4,"three":3,"five":5}}

j("array").At(1) = ValueMap()("key", 1);

DUMP(AsJSON(j));

AsJSON(j) = {"array":[1,{"key":1},3],"value":{"one":1,"four":4,"three":3,"five":5}}

6. Function and lambdas

6.1 Function

U++ Function is quite similar to std::function - it is a function wrapper that can store/copy/invoke any callable target. There are two important differences. First, invoking empty Function is NOP, if Function has return type T, it returns T(). Second, Function allows effective chaining of callable targets using operator<<, if Function has return type, the return type of last callable appended is used.

Usually, the callable target is C++11 lambda:

Function<int (int)> fn = [](int n) { LOG("Called A"); return 3 * n; };

LOG("About to call function");

int n = fn(7);

DUMP(n);

About to call function

Called A

n = 21

If you chain another lambda into Function, all are called, but the last one's return value is used:

fn << [](int n) { LOG("Called B"); return n * n; };

LOG("About to call combined function");

n = fn(7);

DUMP(n);

About to call combined function

Called A

Called B

n = 49

Invoking empty lambda does nothing and returns default constructed return value. This is quite useful for GUI classes, which have a lot of output events represented by Function which are often unassigned to any action.

fn.Clear();

LOG("About to call empty function");

n = fn(7);

DUMP(n);

About to call empty function

n = 0

While using Function with lambda expression is the most common, you can use any target that has corresponding operator() defined:

struct Functor {

int operator()(int x) { LOG("Called Foo"); return x % 2; }

};

fn = Functor();

LOG("About to call Functor");

n = fn(7);

DUMP(n);

About to call Functor

Called Foo

n = 1

As Function with void and bool return types are the most frequently used, U++ defines template aliases Event:

Event<> ev = [] { LOG("Event invoked"); };

ev();

Event invoked

and Gate:

Gate<int> gt = [](int x) { LOG("Gate invoked with " << x); return x < 10; };

bool b = gt(9);

DUMP(b);

b = gt(10);

DUMP(b);

Gate invoked with 9

b = true

Gate invoked with 10

b = false

Using lambda to define calls to methods with more parameters can be verbose and error-prone. The issue can be simplified by using THISFN macro:

struct Foo {

void Test(int a, const String& b) { LOG("Foo::Test " << a << ", " << b); }

typedef Foo CLASSNAME; // required for THISFN

void Do() {

Event<int, const String&> fn;

fn = [=](int a, const String& b) { Test(a, b); };

fn(1, "using lambda");

fn = THISFN(Test); // this is functionally equivalent, but less verbose

fn(2, "using THISFN");

}

};

Foo f;

f.Do();

Foo::Test 1, using lambda

Foo::Test 2, using THISFN

6.2 Capturing U++ containers into lambdas

Capturing objects with pick/clone semantics can be achieved using capture with an initializer:

Vector<int> x{ 1, 2 };

Array<String> y{ "one", "two" };

Event<> ev = [x = pick(x), y = clone(y)] { DUMP(x); DUMP(y); };

DUMP(x); // x is picked, so empty

DUMP(y); // y was cloned, so it retains original value

LOG("About to invoke event");

ev();

x = []

y = [one, two]

About to invoke event

x = [1, 2]

y = [one, two]

7. Multithreading

7.1 Thread

Since C++11, there is now a reasonable support for threads in standard library. There are however reasons to use U++ threads instead. One of them is that U++ high performance memory allocator needs a cleanup call at the the thread exit, which is naturally implemented into Upp::Thread. Second 'hard' reason is that Microsoft compiler is using Win32 API function for condition variable that are not available for Windows XP, while U++ has alternative implementation for Windows XP, thus making executable compatible with it.

Then of course we believe U++ multithreading / parallel programming support is easier to use and leads to higher performance...

Thread class can start the thread and allows launching thread to Wait for its completion:

Thread t;

t.Run([] {

for(int i = 0; i < 10; i++) {

LOG("In the thread " << i);

Sleep(100);

}

LOG("Thread is ending...");

});

for(int i = 0; i < 5; i++) {

LOG("In the main thread " << i);

Sleep(100);

}

LOG("About to wait for thread to finish");

t.Wait();

LOG("Wait for thread done");

In the main thread 0

In the thread 0

In the main thread 1

In the thread 1

In the main thread 2

In the thread 2

In the main thread 3

In the thread 3

In the main thread 4

In the thread 4

About to wait for thread to finish

In the thread 5

In the thread 6

In the thread 7

In the thread 8

In the thread 9

Thread is ending...

Wait for thread done

Thread destructor calls Detach method with 'disconnects' Thread from the thread. Thread continues running.

Thread::Start static method launches a thread without possibility to wait for its completion; if you need to wait, you have to use some other method:

bool x = false;

Thread::Start([&x] { LOG("In the Started thread"); x = true; });

LOG("About to wait for thread to finish");

while(!x) { Sleep(1); } // Do not do this in real code!

LOG("Wait for thread done");

About to wait for thread to finish

In the Started thread

Wait for thread done

(method used here is horrible, but should demonstrate the point).

7.2 Mutex

Mutex ("mutual exclusion") is a well known concept in multithreaded programming: When multiple threads write and read the same data, the access has to be serialized using Mutex. Following invalid code demonstrates why:

Thread t;

int sum = 0;

t.Run([&sum] {

for(int i = 0; i < 1000000; i++)

sum++;

});

for(int i = 0; i < 1000000; i++)

sum++;

t.Wait();

DUMP(sum);

sum = 1060530

While the expected value is 2000000, produced value is different. The problem is that both thread read / modify / write sum value without any locking. Using Mutex locks the sum and thus serializes access to it - read / modify / write sequence is now exclusive for the thread that has Mutex locked, this fixing the issue. Mutex can be locked / unlocked with Enter / Leave methods. Alternatively, Mutex::Lock helper class locks Mutex in constructor and unlocks it in destructor:

Mutex m;

sum = 0;

t.Run([&sum, &m] {

for(int i = 0; i < 1000000; i++) {

m.Enter();

sum++;

m.Leave();

}

});

for(int i = 0; i < 1000000; i++) {

Mutex::Lock __(m); // Lock m till the end of scope

sum++;

}

t.Wait();

DUMP(sum);

sum = 2000000

7.3 ConditionVariable

ConditionVariable in general is a synchronization primitive used to block/awaken the thread. ConditionVariable is associated with Mutex used to protect some data; in the thread that is to be blocked, Mutex has to locked; call to Wait atomically unlocks the Mutex and puts the thread to waiting. Another thread then can resume the thread by calling Signal, which also causes Mutex to lock again. Multiple threads can be waiting on single ConditionVariable; Signal resumes single waiting thread, Brodcast resumes all waitng threads.

bool stop = false;

BiVector<int> data;

Mutex m;

ConditionVariable cv;

Thread t;

t.Run([&stop, &data, &m, &cv] {

Mutex::Lock __(m);

for(;;) {

while(data.GetCount()) {

int q = data.PopTail();

LOG("Data received: " << q);

}

if(stop)

break;

cv.Wait(m);

}

});

for(int i = 0; i < 10; i++) {

{

Mutex::Lock __(m);

data.AddHead(i);

}

cv.Signal();

Sleep(1);

}

stop = true;

cv.Signal();

t.Wait();

Data received: 0

Data received: 1

Data received: 2

Data received: 3

Data received: 4

Data received: 5

Data received: 6

Data received: 7

Data received: 8

Data received: 9

Important note: rarely thread can be resumed from Wait even if no other called Signal. This is not a bug, but design decision for performance reason. In practice it only means that situation has to be (re)checked after resume.

7.4 CoWork

CoWork is intented to be use when thread are used to speedup code by distributing tasks over multiple CPU cores. CoWork spans a single set of worker threads that exist for the whole duration of program run. CoWork instances then manage assigning jobs to these worker threads and waiting for the all work to finish.

Job units to CoWork are represented by Function<void ()> and thus can be written inline as lambdas.

As an example, following code reads input file by lines, splits lines into words (this is the parallelized work) and then adds resulting words to Index:

FileIn in(GetDataFile("test.txt")); // let us open some tutorial testing data

Index<String> w;

Mutex m; // need mutex to serialize access to w

CoWork co;

while(!in.IsEof()) {

String ln = in.GetLine();

co & [ln, &w, &m] {

Vector<String> h = Split(ln, [](int c) { return IsAlpha(c) ? 0 : c; });

Mutex::Lock __(m);

for(const auto& s : h)

w.FindAdd(s);

};

}

co.Finish();

DUMP(w);

w = [Lorem, ipsum, dolor, sit, amet, consectetur, adipiscing, elit, Duis, aute, irure, in, reprehenderit, voluptate, velit, sed, do, eiusmod, tempor, incididunt, ut, labore, et, dolore, magna, ullamco, laboris, nisi, aliquip, ex, ea, commodo, consequat, officia, deserunt, mollit, anim, id, est, laborum, esse, cillum, eu, fugiat, nulla, pariatur, Excepteur, sint, occaecat, cupidatat, non, proident, sunt, culpa, qui, aliqua, Ut, enim, ad, minim, veniam, quis, nostrud, exercitation]

Adding words to w requires Mutex. Alternative to this 'result gathering' Mutex is CoWork::FinLock. The idea behind this is that CoWork requires an internal Mutex to serialize access to common data, so why FinLock locks this internal mutex a bit earlier, saving CPU cycles required to lock and unlock dedicated mutex. From API contract perspective, you can consider FinLock to serialize code till the end of worker job.

in.Seek(0);

while(!in.IsEof()) {

String ln = in.GetLine();

co & [ln, &w, &m] {

Vector<String> h = Split(ln, [](int c) { return IsAlpha(c) ? 0 : c; });

CoWork::FinLock(); // replaces the mutex, locked till the end of CoWork job

for(const auto& s : h)

w.FindAdd(s);

};

}

co.Finish();

DUMP(w);

Of course, the code performed after FinLock should not take long, otherwise there is negative impact on all CoWork instances. In fact, from this perspective, above code is probably past the threshold...

When exception is thrown in CoWork, it is propagated to the thread that calls Finish and CoWork is canceled. If more than single job throws, one of exceptions is selected randomly to be rethrown in Finish.

As CoWork destructor calls Finish too, it is possible that it will be thrown by destructor, which is not exactly recommended thing to do in C++, but is well defined and really the best option here:

in.Seek(0);

try {

while(!in.IsEof()) {

String ln = in.GetLine();

co & [ln, &w, &m] {

if(ln.GetCount() > 75)

throw "Input line was too long!";

Vector<String> h = Split(ln, [](int c) { return IsAlpha(c) ? 0 : c; });

CoWork::FinLock(); // replaces the mutex, locked till the end of CoWork job

for(const auto& s : h)

w.FindAdd(s);

};

}

co.Finish();

}

catch(const char *exception) {

DUMP(exception);

}

Sometimes there is a need for cancellation of the whole CoWork. Cancel method cancels all scheduled jobs that have not been yet executed and sets CoWork to canceled state, which can be checked in job routine using CoWork::IsCanceled:

for(int i = 0; i < 100; i++)

co & [] {

for(;;) {

if(CoWork::IsCanceled()) {

LOG("Job was canceled");

return;

}

Sleep(1);

}

};

Sleep(200); // Give CoWork a chance to start some jobs

co.Cancel();

Job was canceled

Canceling CoWork is common in GUI applications.

7.5 AsyncWork

AsyncWork is CoWork based tool that resembles std::future. AsyncWork instances are created using Async function and represent a work that can be done in parallel with current thread. AsyncWork supports returning values. A call to AsyncWork::Get makes sure that a work routine was finished and returns the return value (if any):

auto a = Async([](int n) -> double {

double f = 1;

for(int i = 2; i <= n; i++)

f *= i;

return f;

}, 100);

DUMP(a.Get());

a.Get() = 9.33262154439441e157

Exceptions thrown in Async work are propagated upon call to Get:

auto b = Async([] { throw "error"; });

try {

b.Get();

}

catch(...) {

LOG("Exception has been caught");

}

Exception has been caught

AsyncWork instances can be canceled (and are canceled in destructor if Get is not called on them):

{

auto c = Async([] {

for(;;)

if(CoWork::IsCanceled()) {

LOG("Work was canceled");

break;

}

});

Sleep(100); // give it chance to start

// c destructor cancels the work (can be explicitly canceled by Cancel method too)

}

Work was canceled

7.6 CoPartition

There is some overhead associated with CoWork worker threads. That is why e.g. performing a simple operation on the array spawning worker thread for each element is not a good idea performance wise:

Vector<int> data;

for(int i = 0; i < 10000; i++)

data.Add(i);

int sum = 0;

CoWork co;

for(int i = 0; i < data.GetCount(); i++)

co & [i, &sum, &data] { CoWork::FinLock(); sum += data[i]; };

co.Finish();

DUMP(sum);

sum = 49995000

Above code computes the sum of all elements in the Vector, using CoWorker job for each element. While producing the correct result, it is likely to run much slower than single-threaded version.

The solution to the problem is to split the array into small number of larger subranges that are processed in parallel. This is what CoPartition template algorithm does:

sum = 0;

CoPartition(data, [&sum](const auto& subrange) {

int partial_sum = 0;

for(const auto& x : subrange)

partial_sum += x;

CoWork::FinLock(); // available as CoPartition uses CoWork

sum += partial_sum;

});

DUMP(sum);

sum = 49995000

Note that CoWork is still internally used, so CoWork::FinLock is available. Instead of working on subranges, it is also possible to use iterators:

sum = 0;

CoPartition(data.begin(), data.end(), [&sum] (auto l, auto h) {

int partial_sum = 0;

while(l != h)

partial_sum += *l++;

CoWork::FinLock(); // available as CoPartition uses CoWork

sum += partial_sum;

});

DUMP(sum);

sum = 49995000

There is no requirement on the type of iterators, so it is even possible to use just indices:

sum = 0;

CoPartition(0, data.GetCount(), [&sum, &data] (int l, int h) {

int partial_sum = 0;

while(l != h)

partial_sum += data[l++];

CoWork::FinLock(); // available as CoPartition uses CoWork

sum += partial_sum;

});

DUMP(sum);

sum = 49995000

7.7 CoDo

An alternative to CoPartition is CoDo. In this pattern, the job is simply started in all threads and the code is responsible for scheduling the work. CoDo waits for all started threads to finish. Scheduling is the responsibility of client code, but can be easily managed using the std::atomic counter. This way, the overhead associated with creating lambdas and scheduling them is kept to the minimum (basically the cost of atomic increment). Once again, CoDo is based on CoWork, so CoWork::FinLock is available.

Vector<String> data;

for(int i = 0; i < 100; i++)

data.Add(AsString(1.0 / i));

double sum = 0;

std::atomic<int> ii(0);

CoDo([&] {

double m = 0;

for(int i = ii++; i < data.GetCount(); i = ii++)

m += atof(data[i]);

CoWork::FinLock();

sum += m;

});

DUMP(sum);

sum = 5.17737751763962

7.8 Parallel algorithms

U++ provides a parallel versions of algorithms where it makes sense. The naming scheme is 'Co' prefix before the name of algorithm designates the parallel version.

So the parallel version of e.g. FindIndex is CoFindIndex, for Sort it is CoSort:

Vector<String> x{ "zero", "one", "two", "three", "four", "five" };

DUMP(FindIndex(x, "two"));

DUMP(CoFindIndex(x, "two"));

CoSort(x);

DUMP(x);

FindIndex(x, "two") = 2

CoFindIndex(x, "two") = 2

x = [five, four, one, three, two, zero]

Caution should be exercised when using these algorithms - for small datasets, they are almost certainly slower than single-threaded versions.

Last edit by cxl on 09/08/2019. Do you want to contribute?. T++