Question regarding boost binary serializer which is eating 90GB of memory, while re-reading the same content.


Boost - Users mailing list
Hi All,

Greetings for the day.

I am trying to understand the behavior of Boost's binary serializer (Boost version 1.65). Please consider the example below, where I write one record to a file and then try to read that same record 5 times.

#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/serialization/string.hpp>
#include <fstream>
#include <iostream>
#include <string>
#include <unistd.h>  // sleep()

using namespace std;
using namespace boost::archive;

class logEntry {
 private:
    size_t m_txID;
    string m_jsonStr;

    friend class boost::serialization::access;
    template <typename Archive>
    friend void serialize( Archive &ar, logEntry &l, const unsigned int version );

 public:
    logEntry() {
        m_txID = 0;
        m_jsonStr = "";
    }
    logEntry( size_t id, const string &val ) {
        m_txID = id;
        m_jsonStr = val;
    }
    string getJsonValue() {
        return m_jsonStr;
    }

    size_t getTxId() {
        return m_txID;
    }
};

// Non-intrusive serialize function; handles both saving and loading.
template <typename Archive>
void serialize( Archive &ar, logEntry &l, const unsigned int version ) {
    ar &l.m_txID;
    ar &l.m_jsonStr;
}

size_t prevReadPos = 0;

void save( size_t n ) {
    ofstream        file{"/tmp/test.bin", ios::binary | ios::trunc};
    binary_oarchive oa{file};

    // Save n records
    for ( size_t i = 0; i < n; i++ )
        oa << logEntry( i, "{Some Json String}" );

    file.flush();
    file.close();
}

// Load data batch wise
void load( size_t bsize ) {
    ifstream        file{"/tmp/test.bin", ios::binary};
    binary_iarchive ia{file};

    // Record file length
    size_t fileEnd;
    size_t beg = file.tellg();  // position after the archive has been constructed

    file.seekg( 0, ios::end );
    fileEnd = file.tellg();
    file.seekg( beg, ios::beg );

    logEntry l;
    for ( size_t i = 0; i < bsize; i++ ) {
        ia >> l;
        // Reset the file read position to the beginning of the file to read the same line one more time.
        // This seek, done while the archive is still in use, is the step that precedes the crash.
        file.seekg( beg, ios::beg );
    }

    prevReadPos = file.tellg();
    file.close();
}

int main() {
    // Saving only one record
    save( 1 );
    while ( 1 ) {
        // Trying to read the same record 5 times
        load( 5 );
        sleep( 5 );
    }
}



With the above code, I am able to read the record once, but as soon as I reset the file pointer to the beginning of the file and read again, the program crashes with SIGKILL.
I also observed that the process consumed around 90 GB of memory (a reading I took from Xcode while debugging the code).
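
One plausible reading of this symptom, offered as an assumption and not verified against Boost's source: an input archive keeps internal state between reads, so bytes it expects to consume only once are not skipped again after a seek. Every later field is then decoded from the wrong offset, and a misread string length can turn into an enormous allocation. The sketch below uses a hypothetical toy format (`makeToyFile`, `ToyReader`, and the 4-byte preamble are all invented for illustration and are not Boost's layout) to show how a stateful reader decodes a very different length when the stream is rewound underneath it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append a 64-bit value in little-endian byte order.
void putU64(std::string &out, std::uint64_t v) {
    for (int i = 0; i < 8; ++i)
        out.push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
}

// Decode a little-endian 64-bit value starting at byte `pos`.
std::uint64_t getU64(const std::string &in, std::size_t pos) {
    assert(pos + 8 <= in.size());
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v |= static_cast<std::uint64_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
    return v;
}

// Hypothetical toy layout (NOT Boost's real format):
//   [4-byte preamble, written once][8-byte id][8-byte length][payload]
const std::size_t kPreambleSize = 4;

std::string makeToyFile(std::uint64_t id, const std::string &payload) {
    std::string out(kPreambleSize, '\0');
    putU64(out, id);
    putU64(out, payload.size());
    out += payload;
    return out;
}

// A reader that, like a stateful archive, consumes the preamble only once.
struct ToyReader {
    bool preambleSeen = false;

    // Returns the payload length as this reader would decode it when
    // positioned at byte `pos`.
    std::uint64_t decodeLengthAt(const std::string &bytes, std::size_t pos) {
        if (!preambleSeen) {
            pos += kPreambleSize;  // skipped on the first read only
            preambleSeen = true;
        }
        pos += 8;  // skip the id field
        return getU64(bytes, pos);
    }
};
```

In this toy format the first read at offset 0 decodes the real length, while a second read from the same offset with the same reader decodes a value shifted by the size of the preamble, which is far larger than the file itself. A real reader that trusted such a length would try to allocate that many bytes.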

My dev environment is as below.

clang --version
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin


Kindly let me know if you need any more information from my side.
--

_______________________________________________
Boost-users mailing list
[hidden email]
https://lists.boost.org/mailman/listinfo.cgi/boost-users
Re: Question regarding boost binary serializer which is eating 90GB of memory, while re-reading the same content.

Boost - Users mailing list
On 1/19/18 11:52 PM, dinesh kumar via Boost-users wrote:
> Hi All,
>
> Greetings for the day.
>
> I am trying to understand the behavior of Boost's binary
> serializer(Boost version 1.65). Please consider the below example, where
> I am inserting one record into the file, and trying to read the same
> record 5 times.
>

It would not occur to me to use the serialization library in this case.
The serialization library saves/restores a whole data structure of
arbitrary complexity.  It does it with one call.  It doesn't really
support "picking apart" the saved file.  It's not a file protocol - it's
much, much more than that.  If you intervene in the process, you'll
likely be surprised at what you get.

Robert Ramey

Re: Question regarding boost binary serializer which is eating 90GB of memory, while re-reading the same content.

Boost - Users mailing list
On Sat, Jan 20, 2018 at 10:12 PM, Robert Ramey via Boost-users <[hidden email]> wrote:
> On 1/19/18 11:52 PM, dinesh kumar via Boost-users wrote:
>> Hi All,
>>
>> Greetings for the day.
>>
>> I am trying to understand the behavior of Boost's binary serializer(Boost version 1.65). Please consider the below example, where I am inserting one record into the file, and trying to read the same record 5 times.
>
> It would not occur to me to use the serialization library in this case. The serialization library saves/restores a whole data structure of arbitrary complexity.  It does it with one call.  It doesn't really support "picking apart" the saved file.  It's not a file protocol - it's much, much more than that.  If you intervene in the process, you'll likely be surprised at what you get.

Agreed. I was getting my hands dirty with serialization and thought I had found a serious problem with the library, so I reported it to the community.
However, I do not actually use serialization to re-read the same content; this was just an experiment that blew up on my machine.

--Dinesh

> Robert Ramey
