lo/core - LibreOffice 核心代码仓库

diff options

author	Lionel Elie Mamane <lionel@mamane.lu>	2012-06-04 17:54:30 +0200
committer	Lionel Elie Mamane <lionel@mamane.lu>	2012-06-04 19:22:48 +0200
commit	e581bef6dfc03d0bab9de1485c6f6cdcd034d581 (patch)
tree	dfd8d19e9ea77ec981cdb7b6254dd5a8d07a640d /readlicense_oo/README
parent	60f5a9d372eb190038a1d37afa48bcac0dbe1875 (diff)

i#102625 avoid fetching same row twice in different queries

We do a "SELECT * FROM table" just to fetch the primary key columns; so reuse the same XResultSet to fetch all columns. Else, we immediately issue a "SELECT * FROM table WHERE primary_key=current_value" to read the other columns, which is wasteful and particularly silly. Commit 1ae17f5b03cc14844fb600ca3573a96deb37ab3b already tried to do that, but was essentially reverted piecewise because it caused fdo#47520, fdo#48345, fdo#50372. Commit c08067d6da94743d53217cbc26cffae00a22dc3a thought it did that, but actually reverted commit 1ae17f5b03cc14844fb600ca3573a96deb37ab3b. This implementation fetches the whole current row and caches it in memory; only one row is cached: when the current row changes, the cache contains the new current row. This could be problematic (wrt to memory consumption) if the current row is big (e.g. with BLOBs) and nobody is interested in the data anyway (as would often be the case with BLOBs). Note that because of our "SELECT *", the driver most probably has it in memory already anyway, so we don't make the situation that much worse. This could be incrementally improved with a heuristic of not preemptively caching binary data (and also not LONGVARCHAR / TEXT / MEMO / ...); a getFOO on these columns would issue a specific "SELECT column FROM table WHERE primary_key=current_value" each time. The *real* complete fix to all these issues would be to not do "SELECT *" at all. Use "SELECT pkey_col1, pkey_col2, ..." when we are only interested in the key columns. As to data, somehow figure out which columns were ar interested in and "SELECT" only these (and maybe only those with "small datatype"?). Interesting columns could be determined by our caller (creator) as an argument to our constructor, or some heuristic (no binary data, no "big" unbound data). Also be extra smart and use *(m_aKeyIter) when getFOO is called on a column included in it (and don't include it in any subsequent SELECT). However, there are several pitfalls. One is buggy drivers that give use column names of columns that we cannot fetch :-| Using "SELECT *" works around that because the driver there *obviously* gives us only fetchable columns in the result. Another one is the very restrictive nature of some database access technologies. Take for example ODBC: - Data can be fetched only *once* (with the SQLGetData interface; bound columns offer a way around that, but that's viable only for constant-length data, not variable-length data). This could be addressed by an intelligent & lazy cache. - Data must be fetched in increasing order of column number (again, this is about SQLGetData). This is a harder issue. The current solution has the nice advantage of completely isolating the rest of LibO from these restrictions. I don't currently see how to cleanly avoid (potentially unnecessarily) caching column 4 if we are asked for column 3 then column 5, just in case we are asked for column 4 later on, unless we issue a specific "SELECT column4" later. But the latter would be quite expensive in terms of app-to-database roudtripe times :-( and thus creates another performance issue. Change-Id: I999b3f8f0b8a215acb390ffefc839235346e8353

Diffstat (limited to 'readlicense_oo/README')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: