
[Bug 373398] Re: Effective 2 GB limit on blend input

 

Ken fixed this. Input tensors can now be stored in PyTables, trading extra
processing time for a much smaller in-memory footprint.
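
For reference, a minimal sketch of the kind of on-disk storage PyTables
provides. The table layout and names here (SparseEntry, blend_input.h5) are
illustrative assumptions, not Divisi's actual schema:

    import tables

    # Hypothetical layout: one on-disk table of (row, col, value) triples
    # per input tensor.  The schema Divisi actually uses may differ.
    class SparseEntry(tables.IsDescription):
        row = tables.Int64Col()
        col = tables.Int64Col()
        value = tables.Float64Col()

    h5 = tables.open_file("blend_input.h5", mode="w")
    entries = h5.create_table("/", "entries", SparseEntry, "nonzero entries")

    # Append entries through a small buffer so only a chunk is in memory.
    entry = entries.row
    for i, j, v in [(0, 3, 1.5), (2, 7, -0.25)]:   # stand-in for real data
        entry["row"], entry["col"], entry["value"] = i, j, v
        entry.append()
    entries.flush()

    # Reading back iterates lazily from disk; that is the time/space tradeoff.
    for r in entries.iterrows():
        print(r["row"], r["col"], r["value"])

    h5.close()

Because the data lives in a chunked HDF5 file, iteration is slower than an
in-memory structure, but the working set stays small.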

** Changed in: divisi
       Status: Confirmed => Fix Committed

-- 
Effective 2 GB limit on blend input
https://bugs.launchpad.net/bugs/373398
You received this bug notification because you are a member of
Commonsense Computing, which is the registrant for Divisi.

Status in Divisi: Fix Committed

Bug description:
The blending code currently multiplies all the input data and puts it into a single sparse matrix before running the blend SVD.

There may in fact be multiple copies of all the data: the original input tensors, the blend tensor, and the CSCMatrix.
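
To make the duplication concrete, here is a rough scipy-based illustration
(not Divisi's actual code or class names) of how the original inputs, the
combined blend tensor, and a CSC copy can all be alive at once:

    import numpy as np
    from scipy.sparse import coo_matrix, csc_matrix
    from scipy.sparse.linalg import svds

    # Two stand-in inputs (real inputs are much larger and much sparser).
    a = coo_matrix(np.random.rand(40, 50))
    b = coo_matrix(np.random.rand(40, 50))
    weights = [0.5, 0.5]

    # Weighted combination: a second full copy of the data.
    blended = (weights[0] * a + weights[1] * b).tocoo()

    # Conversion for the SVD: a third copy, in CSC form.
    csc = csc_matrix(blended)

    u, s, vt = svds(csc, k=10)

Each step keeps its predecessor reachable, so peak memory is roughly three
times the size of the combined data.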

This quickly hits the 2 GB memory limit in 32-bit Python (or, equivalently, quickly eats up 4 GB or more of RAM in 64-bit Python). We need a way to conserve memory. Some possibilities:

* Incremental approaches (perhaps using Jayant's 'hit all the zeros at once' idea to make incremental SVD spiky like Lanczos SVD is)
* SVD of SVDs (add the svd.u's together, not the input matrices, and svd again; sigma and v need to be reconstructed in other ways); see the rough sketch after this list
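
A rough numpy sketch of one reading of the SVD-of-SVDs idea (the variable
names and the truncation rank k are made up for illustration): SVD each
input on its own, keep only the small U * Sigma factors, stack those, and
SVD again to get a combined left basis.

    import numpy as np

    rng = np.random.default_rng(0)
    inputs = [rng.random((100, 60)), rng.random((100, 60))]  # stand-ins
    k = 10

    # Truncated SVD of each input separately; only U * Sigma is kept.
    partial = []
    for m in inputs:
        u, s, vt = np.linalg.svd(m, full_matrices=False)
        partial.append(u[:, :k] * s[:k])

    # Stacking the small factors is far cheaper than summing the full inputs.
    stacked = np.hstack(partial)                      # shape (100, 2 * k)
    u2, s2, vt2 = np.linalg.svd(stacked, full_matrices=False)

    # u2[:, :k] approximately spans the left singular space of the blend;
    # sigma and v still have to be reconstructed some other way, as noted.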


