Hello,
Chris wrote: DerPhysiker wrote: Most likely, you are either running out of RAM
Hmm.... perhaps 2GB physical and 2GB swap wasn't enough after all

That should be enough. I don't know of any larger examples.
Chris wrote:
Low RAM might be an explanation for a sudden slowing, but this is a linear slowing - each block actually takes longer than the previous one, and the increase is steady. I'm able to reliably reproduce this each-block-loads-more-slowly-than-the-last behaviour on machines with varying amounts of RAM and swap. With less RAM, the time increments are a little longer.
micha88 wrote: The file is also processed, not only copied into memory.
This doesn't explain a gradual slowdown. The cost of processing a block should not grow with its position in the file. In mathematical terms, processing the nth block of data should be O(1) (total time O(n)) rather than O(n) (total time O(n²)). The position of a block of data should not affect how long it takes to process that specific block alone (the important part - not counting all the blocks already done or still to do).
The data as stored in the layout file and the data as stored in memory are quite different things.
The file is a linear stream of all the information that is needed, with low redundancy. Connections between objects are stored as handles (everything that has a name also has an internal handle). These data can be read and written linearly, but it is hard to search for a certain entry inside them, e.g. a certain signal or turnout. However, as long as the data are stored on disk, such a search is never needed.
The data in memory are organized in a more complex way. There are many objects that have connections between them. While running the simulation of a large layout, it is necessary to find connected objects quickly, and not by searching through a long linear data list. For that, there exist many direct links. Of course, they are not stored in the file, because it would be nonsense to store address pointers in a file. Instead, these connections are created "on-the-fly" on loading. For instance, every signal system owns a list of pointers to the elements it is assigned to, and vice versa the elements are linked to their signal systems. The time needed to create these data does not depend linearly on their amount. As a result, you may load a large layout quickly when it does not contain signals, but you may need much more time for a smaller layout file that contains some thousand signals with complicated connections.
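To make this "on-the-fly" linking a bit more concrete, here is a minimal sketch in C++. It is only my own illustration, not BAHN's actual code; the types Element and SignalSystem and the function buildLinks are invented. After the linear stream has been read, the handle references are resolved into direct pointers in both directions, using a hash index so that each lookup does not have to search the whole object list:

[code]
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory objects; the real BAHN structures are not public.
struct Element {
    std::uint32_t handle;
    std::uint32_t signalSystemHandle;             // handle as read from the file
    struct SignalSystem* signalSystem = nullptr;  // direct link, built on load
};

struct SignalSystem {
    std::uint32_t handle;
    std::vector<Element*> elements;               // reverse links, built on load
};

// Turn the handle references into direct pointers in both directions.
void buildLinks(std::vector<Element>& elements,
                std::vector<SignalSystem>& systems) {
    // Index the signal systems by handle: each lookup is O(1) on average
    // instead of a linear search through all systems.
    std::unordered_map<std::uint32_t, SignalSystem*> byHandle;
    for (auto& s : systems)
        byHandle[s.handle] = &s;

    for (auto& e : elements) {
        auto it = byHandle.find(e.signalSystemHandle);
        if (it == byHandle.end())
            continue;                             // dangling handle: skip or report
        e.signalSystem = it->second;              // element -> signal system
        it->second->elements.push_back(&e);       // signal system -> element
    }
}

int main() {
    std::vector<SignalSystem> systems  = { {100, {}} };
    std::vector<Element>      elements = { {1, 100}, {2, 100} };
    buildLinks(elements, systems);
    // systems[0].elements now holds direct pointers to both elements.
}
[/code]

Even with the index, building all the cross-links is extra work on top of the plain reading of the stream; with a linear search per handle instead of the index, this step alone would already grow quadratically with the number of objects.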
Also, in some cases BAHN tests for data integrity on loading, e.g. for unique handles or unique names. To test whether something is unique, you need to check it against all the other objects that you have already loaded. This is not linear in the number of objects.
However, most of these tests are turned off when loading files of the current file version.
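To illustrate why such a test is not linear (again only a sketch with invented names, not the real BAHN code): checking the nth object against everything loaded before it costs time proportional to n, so the blocks loaded later are more expensive than the earlier ones and the total grows roughly with the square of the object count - exactly the each-block-takes-longer pattern described above. An index reduces the same test to nearly constant time per object:

[code]
#include <cstdint>
#include <unordered_set>
#include <vector>

// Naive test: compare the new handle with every object loaded so far.
// The k-th insertion costs O(k), so loading n objects costs O(n^2) in total.
bool isUniqueNaive(const std::vector<std::uint32_t>& loaded, std::uint32_t h) {
    for (std::uint32_t existing : loaded)
        if (existing == h)
            return false;
    return true;
}

// Indexed test: a hash set answers the same question in O(1) on average,
// so the total stays close to linear in the number of objects.
bool isUniqueIndexed(std::unordered_set<std::uint32_t>& seen, std::uint32_t h) {
    return seen.insert(h).second;   // insert() reports whether h was new
}

int main() {
    std::vector<std::uint32_t> loaded = {1, 2, 3};
    std::unordered_set<std::uint32_t> seen(loaded.begin(), loaded.end());
    bool a = isUniqueNaive(loaded, 3);   // false: 3 is already present
    bool b = isUniqueIndexed(seen, 4);   // true: 4 is new (and is now recorded)
    return (a == false && b == true) ? 0 : 1;
}
[/code]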
Sorting is similar: Some objects are sorted by name, number or co-ordinates, and the search algorithms rely on that. However, there are some known situations of data mismatch caused by programming bugs in earlier versions. As a result, it is a good idea to re-sort the objects on loading. For many objects of the same type this may take some time, and of course, it is not linear either.
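A corresponding sketch for the re-sorting step (NamedObject and resortByName are invented for the example): sorting m objects of one type needs O(m log m) comparisons, and comparing names is itself not constant-time, so this pass is also more than linear in the number of objects, but it makes later binary searches by name valid:

[code]
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical named object; the real BAHN structures are not public.
struct NamedObject {
    std::string name;
    int x = 0, y = 0;   // co-ordinates, as a second possible sort key
};

// Re-sort on loading so that name searches work even if an older file
// was written with a slightly wrong order.
void resortByName(std::vector<NamedObject>& objects) {
    std::sort(objects.begin(), objects.end(),
              [](const NamedObject& a, const NamedObject& b) {
                  return a.name < b.name;   // O(m log m) comparisons in total
              });
}

int main() {
    std::vector<NamedObject> objects = { {"S2"}, {"S10"}, {"S1"} };
    resortByName(objects);
    // Order is now S1, S10, S2 (plain string comparison, not numeric).
}
[/code]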
Chris wrote:
As for the different sorts of information being used, I believe the part where things are initialised is labelled as such, and uses a percentage indicator (shortly followed by a slew of errors where I've got out-of-date graphics files).
A percentage indicator needs to know the amount of data before starting, to define what "100%" really means. However, the length of a BAHN layout file is not a figure that really makes sense for that, because equal parts of the file can require very different amounts of work.
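As a small illustration of that point (purely hypothetical, I do not know how BAHN computes its indicator): the only total known before loading starts is the file length, so a percentage can only be defined over bytes, although equal spans of bytes may stand for very different amounts of work:

[code]
#include <cstdio>

// Percentage based on file position: "100%" is defined by the file length,
// which is known before loading starts.
int percentDone(long bytesRead, long fileLength) {
    if (fileLength <= 0)
        return 0;
    return static_cast<int>(100.0 * bytesRead / fileLength);
}

int main() {
    long fileLength = 40L * 1024 * 1024;   // e.g. a 40 MB layout file
    long bytesRead  = 10L * 1024 * 1024;
    // Shows 25%, although the block being processed (e.g. thousands of
    // signals) may take far longer than the 25% that was plain track.
    std::printf("%d%%\n", percentDone(bytesRead, fileLength));
}
[/code]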
Chris wrote:
Granted, this is only happening with really big layouts, but the faster-than-linear growth means that throwing more RAM and faster hardware at it will not improve things...
Yes. Many extensions and algorithms result in a cost per object that is greater than O(1).
Greetings
Jan B.