
Reliability with a large library

PostPosted: Wed Apr 06, 2016 8:30 pm
by falk
Hi,

I am generally pleased with Serviio Pro. However, I am running into a couple of problems, most important ones first. This is for Serviio 1.6 on Mac OSX 10.10.

1. Whenever Serviio completes its rescan (auto or manual), the media tree isn't in sync. Most importantly, Serviio never deletes old entries for files which no longer exist.

2. In many cases, the server crashes during or after the rescan (I have to restart it via the console).

3. To get it even close to working, I had to increase the Java heap space to 2 GB, i.e., 4x its default value!

4. Creating thumbnails for large images (say, 80 MP) takes *ages*; a full scan takes about a week (and a lot of Java heap space)! This is ridiculous. What code does Serviio use to create a thumbnail? Normally, extracting a thumb from a JPG is a fast operation, but not so with Serviio.

5. I have odd entries in the "movies" content category, like movies I never added to the library. Some of these entries link to files which do not exist.

Please help, as this spoils an otherwise good user experience.

What info do I need to provide to sort this out?
I have automatic rescan enabled, search for updated files enabled, and thumbnail generation enabled. The media is on a local Thunderbolt drive; the Derby database is on the local SSD. The items not being deleted are mostly in one of the top media folders added as a "shared folder".

Kind regards, Falk

Re: Reliability with a large library

PostPosted: Wed Apr 06, 2016 9:03 pm
by zip
Large images do need a lot of heap space, as the Java algorithm for resizing requires loading the whole image into memory as a bitmap, doing the resize, and saving. It's the only way AFAIK without using a more efficient external tool like ImageMagick.
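For context, the heap cost of that approach is easy to estimate: a BufferedImage decoded at 4 bytes per pixel (ARGB) needs width × height × 4 bytes for the raw pixel data alone, before any decode buffers or the scaled copy. A back-of-the-envelope sketch (the 4-bytes-per-pixel figure assumes an ARGB pixel layout):

```java
public class HeapEstimate {
    // Estimated heap for holding an image as a 4-byte-per-pixel bitmap.
    static long argbBitmapBytes(long megapixels) {
        return megapixels * 1_000_000L * 4L;
    }

    public static void main(String[] args) {
        // An 80 MP photo needs roughly 305 MB just for the raw pixel data,
        // not counting decode buffers or the resized copy.
        System.out.println(argbBitmapBytes(80) / (1024 * 1024) + " MB");
    }
}
```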

Re: Reliability with a large library

PostPosted: Wed Apr 06, 2016 9:52 pm
by falk
Hi zip,

The memory consumption problem isn't top of my list, but it's the only topic I can reply to right now ;)

I am aware of the Java image processing options:
- You could call ImageMagick as an external command.
- Or read the image into a BufferedImage (which would use about 305 MB of heap for my 80 MP images, rather than the more than 1 GB it takes now), then make a sparse pass reading only, say, every 4th row and 4th column to build another BufferedImage at 1/16 the original size (for originally large images), and scale from there as you do now. That would be more than 10 times faster, use 1/4 the memory, and have no visible effect on thumbnail quality.
- Or extract the thumbnail from the JPG's EXIF metadata, which avoids even reading the full file!
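The second option above can be sketched with the subsampling support already built into javax.imageio: the reader decodes only every nth pixel in both dimensions, so the full-size bitmap never has to exist in memory. This is a sketch of the idea, not Serviio's actual code:

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

public class SubsampledRead {
    // Decode only every nth pixel in each dimension, so an 80 MP JPEG
    // arrives in memory at roughly 1/(n*n) of its full bitmap size.
    static BufferedImage readSubsampled(File file, int n) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(file)) {
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No image reader for " + file);
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in);
                ImageReadParam param = reader.getDefaultReadParam();
                // Read every nth row and nth column, starting at (0, 0).
                param.setSourceSubsampling(n, n, 0, 0);
                return reader.read(0, param);
            } finally {
                reader.dispose();
            }
        }
    }
}
```

The final high-quality scaling to thumbnail size would then run on this much smaller intermediate image.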

As it is now, it makes me think Serviio is still in prototype state.

Nevertheless, the reliability issue bothers me more. I see a number of Java exceptions being thrown (but not always). What should I do?

Kind regards,
Falk

Re: Reliability with a large library

PostPosted: Wed Apr 06, 2016 10:32 pm
by zip
Post the exceptions here.

Re: Reliability with a large library

PostPosted: Thu Apr 07, 2016 5:16 pm
by falk
Hi Petr,

I've switched the logging level to DEBUG and triggered another rescan (btw, after launchctl unload, I additionally have to kill the Java process -- it won't quit by itself). I also increased the logfile sizes. I am getting 1.2 GB of logs, but none of the entries are related to expired media items. All exceptions in the log concern metadata extraction or the JPG libraries' read functions throwing exceptions. As I cannot share that much log data here in the forum, I dug deeper.

There is *ANOTHER* log file at /var/log/serviio/serviio.log, and it contains additional exceptions (which seem to go uncaught by the Serviio code and should at least be caught and logged as ERRORs in the normal log file). The most important one is this:
  Code:
Exception in thread "ActionDistributor: Thread-2" Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
   at java.util.LinkedHashMap$LinkedValues.iterator(LinkedHashMap.java:588)
   at org.simpleframework.transport.reactor.ActionDistributor.cancel(ActionDistributor.java:365)
   at org.simpleframework.transport.reactor.ActionDistributor.execute(ActionDistributor.java:188)
   at org.simpleframework.transport.reactor.ActionDistributor.run(ActionDistributor.java:172)
   at java.lang.Thread.run(Thread.java:745)
java.lang.OutOfMemoryError: Java heap space
   at org.apache.derby.impl.jdbc.Util.javaException(Unknown Source)
   at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
   at org.apache.derby.impl.jdbc.EmbedResultSet.noStateChangeException(Unknown Source)
   at org.apache.derby.impl.jdbc.EmbedResultSet.getString(Unknown Source)
   at org.apache.derby.impl.jdbc.EmbedResultSet.getString(Unknown Source)
   at org.serviio.library.dao.MediaItemDAOImpl.initMediaItem(MediaItemDAOImpl.java:454)
   at org.serviio.library.dao.MediaItemDAOImpl.mapResultSet(MediaItemDAOImpl.java:433)
   at org.serviio.library.dao.MediaItemDAOImpl.getMediaItemsInRepository(MediaItemDAOImpl.java:195)
   at org.serviio.library.local.service.MediaService.getMediaItemsInRepository(MediaService.java:83)
   at org.serviio.library.local.indexing.LibraryOneTimeScanner.searchForRemovedAndUpdatedFiles(LibraryOneTimeScanner.java:228)
   at org.serviio.library.local.indexing.LibraryOneTimeScanner.searchForUpdatesAndRemovals(LibraryOneTimeScanner.java:133)
   at org.serviio.library.local.indexing.LibraryOneTimeScanner.scanLibrary(LibraryOneTimeScanner.java:82)
   at org.serviio.library.local.indexing.LocalLibraryManager.performManualScan(LocalLibraryManager.java:186)
   at org.serviio.library.local.indexing.LocalLibraryManager.startLibraryScanning(LocalLibraryManager.java:117)
   at org.serviio.MediaServer.main(MediaServer.java:162)


So, in org.serviio.library.dao.MediaItemDAOImpl.getMediaItemsInRepository(), Serviio tries to read the ENTIRE existing database into memory at once, and fails at that. I thought I had already increased the memory footprint beyond reasonable (for thumbnail creation), but obviously not.

Analyzing the exception, the problem is most likely the full iteration over a large ResultSet. I don't know about Derby specifically, but many JDBC implementations are poor in that they keep a reference to every row already visited in the ResultSet object (to implement ResultSet.previous() without revisiting the database), so the ResultSet grows in memory even if its consumer stays small (e.g., just filling a HashSet). This can be overcome by splitting the single loop over the ResultSet into many smaller ones, where each smaller loop uses ResultSet.absolute() (or an OFFSET in the SQL) to skip data already read, possibly combined with ResultSet.setFetchSize() to avoid reading too much data twice. The consumer would still see all the data. The chunk size can then be chosen to fit into Serviio's 512 MB default heap; my feeling is it would be around 250k records. Note that media archives can comprise millions of entries.
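The chunked iteration could look roughly like this. It is a sketch under assumptions: fetchPage stands in for a JDBC query ending in an offset clause such as Derby's "OFFSET ? ROWS FETCH NEXT ? ROWS ONLY", and the record type is simplified to a String; none of the names are Serviio's real code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class ChunkedScan {
    // Replace one huge loop over a ResultSet with many bounded loops,
    // each fetching at most `chunkSize` rows. fetchPage(offset, limit)
    // abstracts the paged JDBC query; only one page of rows is ever
    // referenced by the driver at a time.
    static List<String> readAll(BiFunction<Integer, Integer, List<String>> fetchPage,
                                int chunkSize) {
        List<String> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            List<String> page = fetchPage.apply(offset, chunkSize);
            all.addAll(page);
            if (page.size() < chunkSize) {
                break; // last (possibly partial) page
            }
            offset += page.size();
        }
        return all;
    }
}
```

The consumer still sees every row, but peak memory is bounded by the chunk size instead of the table size.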


Since Serviio uses a 64-bit JVM, I increased the heap to 4 GB, and now it works. The delta scanner runs with a memory footprint of 3.5 GB.

I.e., I solved the main problem I was experiencing: the Serviio content is now in sync with the actual media files.

Still, a more incremental algorithm for iterating over existing entries, or a more defensive approach, would be appropriate -- or at least a manual garbage collection to shrink the memory footprint after each run of the delta scanner.

Any ideas how to reduce the Serviio memory footprint?

Moreover, any feedback on my other problems (e.g., the odd entries in the "movies" category, like 30 entries for some "Bruce Lee" movie where I have none)? Meanwhile, I checked: they come from the VTS_x_y.VOB files of archived, not-yet-transcoded DVDs.


Kind regards,
Falk

Re: Reliability with a large library

PostPosted: Thu Apr 07, 2016 7:07 pm
by zip
Thanks for the investigation. I've raised a ticket to look at it for a future version: https://bitbucket.org/xnejp03/serviio/i ... -footprint

For the wrong titles, it's due to how you name your files / folders. Serviio goes to online databases to try to find the metadata. http://serviio.org/index.php?option=com ... icle&id=12

Re: Reliability with a large library

PostPosted: Thu Apr 07, 2016 11:51 pm
by falk
Thanks for the ticket, Petr.

Maybe add a note to the ticket: catch OutOfMemoryError and log an ERROR to the standard logfile, with instructions on how to increase the heap.
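A minimal sketch of that suggested handling (the wrapper, return values, and message text are all illustrative, not Serviio's actual code, which would route the message through its normal logger):

```java
public class ScanGuard {
    // Illustrative: wrap the library scan so an OutOfMemoryError surfaces
    // in the normal log with actionable advice, instead of dying on stderr.
    static String runScan(Runnable scan) {
        try {
            scan.run();
            return "OK";
        } catch (OutOfMemoryError e) {
            // In a real server this would be logged at ERROR level.
            return "ERROR: Java heap exhausted during library scan; "
                 + "increase the JVM heap (e.g. -Xmx2g) and rescan.";
        }
    }
}
```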

Wrt the wrong metadata: I did read that page, but it doesn't explain what I got. There was no hint of any movie of that sort in the pathname. I assume Serviio falsely detected the DVD .vob files as series episode files and performed too loose a lookup against the online series database. Or the online database always returns *something*, even if the search string doesn't match.

Anyway, the wrong titles have gone away along with the other fix, since in the meantime the DVD (Brokeback Mountain, and named correctly) was transcoded and deleted anyway. I have no clue how Bruce Lee could get involved, except for the first two letters ...