On Fri, Feb 21, 2014 at 7:34 AM, Alex Rousskov
<rousskov@measurement-factory.com> wrote:
> On 02/20/2014 08:08 PM, Nikolai Gorchilov wrote:
>> On Fri, Feb 21, 2014 at 12:13 AM, Alex Rousskov wrote:
>>> On 02/15/2014 06:12 AM, Nikolai Gorchilov wrote:
>>>> I'm trying to avoid the following scenario (excerpt from store.log):
>>>>
>>>> 1392406208.398 SWAPOUT 00 00000000 8C2B9C51268EFEEDEB33FB9EC53030A1
>>>> 200 1392406217 1382373187 1394998217 image/jpeg 21130/21130 GET
>>>> http://www.gnu.org/graphics/t-desktop-4-small.jpg
>>>> 1392406242.459 SWAPOUT 00 00000000 8C2B9C51268EFEEDEB33FB9EC53030A1
>>>> 200 1392406217 1382373187 1394998217 image/jpeg 21130/21130 GET
>>>> http://www.gnu.org/graphics/t-desktop-4-small.jpg
>>>>
>>>> The first request was served by kid1: it fetched the object via
>>>> HIER_DIRECT, cached it in memory, and stored it to its own
>>>> storage (say /S1). Seconds later, the same request arrived at
>>>> kid2, which retrieved the object from shared memory (hierarchy
>>>> code NONE) and then swapped it out to its own storage (say /S2).
>>>>
>>>> The question is how to prevent kid2 from saving the duplicate
>>>> object. Is there any mechanism other than switching
>>>> memory_cache_shared off?
>
>
>>> Yes, rock store. If you get the above behavior while using rock
>>> cache_dirs (shared by all workers) and no other cache_dirs, then it is a
>>> bug.
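(For reference, my reading of that is a setup along these lines --
paths and sizes are just examples on my part:

  # one rock cache_dir shared by all kids; no per-kid aufs dirs
  workers 2
  cache_dir rock /var/cache/squid/rock 16384

i.e. a single store that every worker reads and writes, so kid2 would
find kid1's copy instead of swapping out its own.)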
>
>
>> Nope, it's aufs.
>
>
> With aufs, kid2 does not know that kid1 already has the object in its
> kid1-specific cache_dir because aufs store is not SMP-aware: You are
> forced to use kid-specific aufs cache_dirs to avoid cache corruption.
Unfortunately yes :)
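(That is what I run today -- roughly the following, with paths and
sizes illustrative; ${process_number} expands to the kid's number:

  # each kid gets a private aufs cache_dir to avoid corruption
  workers 2
  cache_dir aufs /var/cache/squid/kid${process_number} 16384 16 256

so kid1 writes to .../kid1 and kid2 to .../kid2, and neither can see
the other's objects.)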
> Besides making aufs SMP-aware, it would be possible to reduce such
> duplicate storage by hacking the shared memory cache index to keep
> information about the cache_dir holding the object (if any), but it is
> not easy to do, it will not work reliably, and you will still face
> similar problems when the object is purged from the memory cache or is
> never stored there.
Seems switching memory_cache_shared off is the only "stable" solution for now.
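That is, just:

  # give each kid a private, non-shared memory cache
  memory_cache_shared off

at the price of every kid keeping its own copy of hot objects in
memory.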
Moving to large rock is also an option, but I'm afraid to do so
because of the heavy trunk refactoring going on these days. In your
opinion, how feasible would it be to back-port just the bare minimum
needed for large rock support from trunk to 3.4?
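(If I do migrate, I gather from trunk that the configuration would be
something along these lines -- the slot size and object size limit
are only my guesses:

  # large rock: a large object spans a chain of slots
  cache_dir rock /var/cache/squid/rock 16384 slot-size=16384 max-size=1048576

-- as opposed to 3.4's rock store, which caps objects at 32KB.)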
Best,
Niki