Re: Known issues loading lots of objects with xml_iarchive


Re: Known issues loading lots of objects with xml_iarchive

Christopher Gillett
See I foolishly assumed that all your code was bug free :-) lol...sorry
for the nebulous report - here is something a bit more concrete.

The scenario as stated is loading a large number of objects that have
been previously serialized into a number of files.  After around 150
files have been processed I hit an archive exception and stream error.
This occurs only when using the XML archive...if I use the binary
archive I can process as many files as I'd like (and in fact I've tested
many more than I actually need).

The approach is "textbook" in that I've seen this same pattern many
times in sample code, tutorials, etc.  The only difference in the code
between binary and xml serialization is my load function:

XML:
#include <cassert>
#include <fstream>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/archive/xml_iarchive.hpp>
void load( MyObjType &obj, const char *fileName )
{
  std::ifstream ifs( fileName );
  assert( ifs.good() );  
  boost::archive::xml_iarchive ia( ifs );
  ia >> BOOST_SERIALIZATION_NVP( obj );
}

Binary:
#include <cassert>
#include <fstream>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
void load( MyObjType &obj, const char *fileName )
{
  std::ifstream ifs( fileName );
  assert( ifs.good() );  
  boost::archive::binary_iarchive ia( ifs );
  ia >> obj;
}
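
A side note on the assert( ifs.good() ) check in both versions: assert()
is compiled out when NDEBUG is defined, so in a release build a failed
open would only show up later as a stream error from inside the archive.
A minimal sketch of a check that survives release builds follows;
openForLoad is a hypothetical helper name, not part of Boost, and the
sketch assumes a C++11 standard library where std::ifstream is movable.

```cpp
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>

// Opens fileName for reading and throws if the open fails, so the failure
// is reported with the file name even when NDEBUG disables assert().
// openForLoad is a hypothetical helper name used for illustration only.
std::ifstream openForLoad(const std::string& fileName,
                          std::ios_base::openmode mode = std::ios_base::in)
{
    std::ifstream ifs(fileName, mode);
    if (!ifs.is_open()) {
        throw std::runtime_error("cannot open archive file: " + fileName);
    }
    return ifs;  // returned by move
}
```

The returned stream can then be handed to the archive constructor exactly
as in the load functions above.  For the binary archive it is probably
worth passing std::ios_base::binary as the mode as well, since text-mode
translation can alter binary data on some platforms.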

Each class I'm serializing has the usual "boilerplate" in the class
definition, for example:

#include <ostream>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/serialization/utility.hpp>
#include <boost/serialization/list.hpp>
#include <boost/serialization/version.hpp>
class MyClass : public MyClassBaseObject
{
    friend class boost::serialization::access;
    friend std::ostream & operator<<(std::ostream &os,
                                     const MyClass &MyClassObj);

    template<class Archive>
    void serialize(Archive &ar, const unsigned int version)
    {
        ar & BOOST_SERIALIZATION_BASE_OBJECT_NVP( MyClassBaseObject );
        ar & BOOST_SERIALIZATION_NVP( thing1 );
        ar & BOOST_SERIALIZATION_NVP( thing2 );
        ar & BOOST_SERIALIZATION_NVP( etc );
    }
    ...
};

I initially thought this was a file management problem, but since I can
load as many files as I'd like with the binary archive I suspect a bug
in the XML archive code.  While I can live with binary serialization if
necessary, my goal is XML serialization.

Any thoughts on this?  I can put together a totally contrived example
program that will demonstrate the failure if necessary.  I'm hoping to
hear that it's something stupid on my part.

Thanks,
Chris Gillett
 

> Date: Mon, 13 Mar 2006 18:40:32 -0800
> From: "Robert Ramey" <[hidden email]>
> Subject: Re: [Boost-users] Known issues loading lots of objects
>        with xml_iarchive?
> To: [hidden email]
> Message-ID: <dv5aeu$l6e$[hidden email]>
>
> There are no "known" issues.  In fact the system has no known bugs
> either.  LOL - of course the function of this list is to address the
> problem.  I haven't heard of anyone having this problem.  You might
> investigate a little by seeing if it occurs just in XML or in other
> archives as well.
>
> Robert Ramey
>
> Christopher Gillett wrote:
>> I am serializing a LOT of objects, one per file.  I can create XML
>> for many thousands of objects with no issue.  However, when trying to
>> load a few hundred objects (using a separate program, btw) I am
>> hitting a boost::archive::archive_exception with a stream error while
>> loading around the 161st file, which I believe based on stack traces
>> is coming from within
>> boost::archive::basic_xml_grammar<char>::parse_end_tag ().
>>
>> The approach I'm taking is fairly textbook, so I'm curious if there
>> are known issues with loading a lot of files, etc.
>>
>> Any advice appreciated.
>>
>> Thanks,
>> Chris Gillett


_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Re: Known issues loading lots of objects with xml_iarchive

Robert Ramey
Christopher Gillett wrote:
> See I foolishly assumed that all your code was bug free :-)

I won't be so foolish!

> lol...sorry for the nebulous report - here is something a bit more
> concrete.
>
<snip>

Your code looks alright to me.  Here are a couple of random observations:

a) Archives are reconstructed for each file, so it's hard to imagine
how one archive might have a "memory" of previous ones.
b) In your binary example, you left out the NVP wrapper, so your
experiment changes two things at once: the NVP wrapper and the
archive type.  Try the binary archive with the NVP wrapper, which
should reduce to a no-op with no side effects.  If it fails, that
would be interesting to know.
c) XML parsing is done with the Spirit library.  Since the parser
is constructed anew with each archive, it's doubtful that this
is the problem - but Spirit is quite clever and it's possible
that there is a side effect in there somewhere.
d) The serialization library uses an extended type system which keeps
information about the types for which serialization code is generated.
It's possible that this is subject to some side effect that
occurs during the course of serialization.  It seems unlikely that
this would vary depending on whether the archive is XML or binary,
but it's conceivable.
e) The archives use custom code conversion (codecvt) facets.  This code
uses less often used aspects of the standard I/O streams, and I've found
a lot of variation in this area among different implementations of the
standard library.  XML and binary use different codecvt facets.
This would be consistent with the observed behavior that the
problem occurs with only one sort of archive (you might try text
archives, which use the same stream I/O as XML) and with the
fact that the error is detected in the stream I/O library - the
serialization system just passes it on with an exception.

So I do suspect you may well have come upon a legitimate bug
and look forward to your report regarding its source and the
required fix.

Thanks for your help.

Robert Ramey

>
> I initially thought this was a file management problem, but since I
> can load as many files as I'd like with the binary archive I suspect
> a bug in the XML archive code.  While I can live with binary
> serialization if necessary, my goal is XML serialization.
>
> Any thoughts on this?  I can put together a totally contrived example
> program that will demonstrate the failure if necessary.  I'm hoping to
> hear that it's something stupid on my part.
>


