commonsense team mailing list archive
Message #00056
[Bug 373398] Re: Effective 2 GB limit on blend input
Ken fixed this. Input tensors can now be stored in PyTables (trading considerable speed for a much smaller memory footprint).
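A minimal sketch of the idea behind the fix. Divisi's actual code uses PyTables, which isn't reproduced here; a numpy memmap illustrates the same principle of keeping tensor entries on disk rather than in RAM, at the cost of slower access. All names below are hypothetical.

```python
# Illustrative only: the real fix stores input tensors in PyTables, but
# the core idea -- entries live in a file, not in process memory -- can
# be shown with a disk-backed numpy memmap.
import os
import tempfile
import numpy as np

def disk_backed_entries(entries, path=None):
    """Store (row, col, value) triples on disk; return a read-only view."""
    if path is None:
        fd, path = tempfile.mkstemp(suffix=".dat")
        os.close(fd)
    arr = np.memmap(path, dtype=np.float64, mode="w+",
                    shape=(len(entries), 3))
    arr[:] = entries   # single pass writes the triples through to disk
    arr.flush()        # after this, the data lives in the file
    del arr
    return np.memmap(path, dtype=np.float64, mode="r",
                     shape=(len(entries), 3))

triples = [(0, 1, 2.5), (3, 0, -1.0)]
view = disk_backed_entries(triples)
print(view[1, 2])  # -> -1.0
```

Reads now go through the OS page cache instead of resident memory, which is where the time/space tradeoff comes from.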
** Changed in: divisi
Status: Confirmed => Fix Committed
--
Effective 2 GB limit on blend input
https://bugs.launchpad.net/bugs/373398
You received this bug notification because you are a member of
Commonsense Computing, which is the registrant for Divisi.
Status in Divisi: Fix Committed
Bug description:
The blending code currently multiplies all the input data, and puts it into a sparse matrix, before running the blend SVD.
There may in fact be multiple copies of all the data: the original input tensors, the blend tensor, and the CSCMatrix.
This quickly hits the 2 GB memory limit in 32-bit Python (or, equivalently, quickly consumes 4 GB or more of RAM in 64-bit Python). We need a way to conserve memory. Some possibilities:
* Incremental approaches (perhaps using Jayant's 'hit all the zeros at once' idea to make incremental SVD spiky like Lanczos SVD is)
* SVD of SVDs (combine the per-input svd.u factors instead of the input matrices, then SVD again; sigma and v would need to be reconstructed separately)
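The second option can be sketched with numpy. This is one plausible reading of the proposal, not Divisi's implementation: SVD each input separately, keep only the sigma-weighted top-k columns of each U, stack those small factors, and SVD the stack. The combined matrix is far smaller than the assembled input, and (without truncation) its left singular vectors and singular values match those of the concatenated inputs exactly, since the two have the same Gram matrix. All names are hypothetical.

```python
# Hedged sketch of "SVD of SVDs": combine per-input U factors and SVD
# again, rather than assembling one huge input matrix first.
import numpy as np

def svd_of_svds(matrices, k=2):
    """Approximate top-k left singular vectors/values of the inputs' blend."""
    parts = []
    for m in matrices:
        u, s, vt = np.linalg.svd(m, full_matrices=False)
        parts.append(u[:, :k] * s[:k])  # top-k columns, weighted by sigma
    combined = np.hstack(parts)         # much smaller than the raw inputs
    u2, s2, _ = np.linalg.svd(combined, full_matrices=False)
    return u2[:, :k], s2[:k]            # v must be reconstructed separately

rng = np.random.default_rng(0)
a = rng.standard_normal((6, 4))
b = rng.standard_normal((6, 5))
u, s = svd_of_svds([a, b])
print(u.shape, s.shape)  # (6, 2) (2,)
```

The input matrices never need to coexist with a combined matrix in memory; only the small sigma-weighted U factors do.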