commonsense team mailing list archive
Message #00056
[Bug 373398] Re: Effective 2 GB limit on blend input
Ken fixed this. Input tensors can now be stored in PyTables (trading considerable speed for a much smaller memory footprint).
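A minimal sketch of the idea behind the fix. Divisi's actual code uses PyTables, which isn't reproduced here; a numpy memmap illustrates the same principle of keeping tensor entries on disk rather than in RAM, at the cost of slower access. All names below are hypothetical.

```python
# Illustrative only: the real fix stores input tensors in PyTables, but
# the core idea -- entries live in a file, not in process memory -- can
# be shown with a disk-backed numpy memmap.
import os
import tempfile
import numpy as np

def disk_backed_entries(entries, path=None):
    """Store (row, col, value) triples on disk; return a read-only view."""
    if path is None:
        fd, path = tempfile.mkstemp(suffix=".dat")
        os.close(fd)
    arr = np.memmap(path, dtype=np.float64, mode="w+",
                    shape=(len(entries), 3))
    arr[:] = entries   # single pass writes the triples through to disk
    arr.flush()        # after this, the data lives in the file
    del arr
    return np.memmap(path, dtype=np.float64, mode="r",
                     shape=(len(entries), 3))

triples = [(0, 1, 2.5), (3, 0, -1.0)]
view = disk_backed_entries(triples)
print(view[1, 2])  # -> -1.0
```

Reads now go through the OS page cache instead of resident memory, which is where the time/space tradeoff comes from.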
** Changed in: divisi
Status: Confirmed => Fix Committed
--
Effective 2 GB limit on blend input
https://bugs.launchpad.net/bugs/373398
You received this bug notification because you are a member of
Commonsense Computing, which is the registrant for Divisi.
Status in Divisi: Fix Committed
Bug description:
The blending code currently multiplies all the input data, and puts it into a sparse matrix, before running the blend SVD.
There may in fact be multiple copies of all the data: the original input tensors, the blend tensor, and the CSCMatrix.
This quickly hits the 2 GB memory limit in 32-bit Python (or, equivalently, quickly consumes 4 GB or more of RAM in 64-bit Python). We need a way to conserve memory. Some possibilities:
* Incremental approaches (perhaps using Jayant's 'hit all the zeros at once' idea to make incremental SVD spiky like Lanczos SVD is)
* SVD of SVDs (combine the per-input svd.u factors instead of the input matrices, then SVD again; sigma and v would need to be reconstructed separately)
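The second option can be sketched with numpy. This is one plausible reading of the proposal, not Divisi's implementation: SVD each input separately, keep only the sigma-weighted top-k columns of each U, stack those small factors, and SVD the stack. The combined matrix is far smaller than the assembled input, and (without truncation) its left singular vectors and singular values match those of the concatenated inputs exactly, since the two have the same Gram matrix. All names are hypothetical.

```python
# Hedged sketch of "SVD of SVDs": combine per-input U factors and SVD
# again, rather than assembling one huge input matrix first.
import numpy as np

def svd_of_svds(matrices, k=2):
    """Approximate top-k left singular vectors/values of the inputs' blend."""
    parts = []
    for m in matrices:
        u, s, vt = np.linalg.svd(m, full_matrices=False)
        parts.append(u[:, :k] * s[:k])  # top-k columns, weighted by sigma
    combined = np.hstack(parts)         # much smaller than the raw inputs
    u2, s2, _ = np.linalg.svd(combined, full_matrices=False)
    return u2[:, :k], s2[:k]            # v must be reconstructed separately

rng = np.random.default_rng(0)
a = rng.standard_normal((6, 4))
b = rng.standard_normal((6, 5))
u, s = svd_of_svds([a, b])
print(u.shape, s.shape)  # (6, 2) (2,)
```

The input matrices never need to coexist with a combined matrix in memory; only the small sigma-weighted U factors do.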