It's so easy to pick the wrong one #JoelKallmanDay

I started this blog enthusiastically. And then life happened. Today is Joel Kallman Day, so it seems a good day to pick this up again.

Some time ago I ran into a problem with my code, where the cache for a result cache function wasn’t invalidated if DML was performed on the table the function selected from.
This led to the function returning old values because it was under the impression the cache was still valid.
I’ve checked and seen this behavior in all versions between 12.1.0.2 and the currently highest available version 23c FREE.

In the end it turned out to be caused by a “for update” clause that I unintentionally copied when I put the query into the function from someplace else.
So, it wouldn’t have happened if paid more attention to what I was copying.
Nevertheless, the combination of a result cache and a “for update” is as far as I know perfectly valid, although it implies some bad practice.
The behavior was (for me) unexpected, so it took me a couple of hours to find the cause, because I was concentrating on what I was doing wrong in the rest of the code.
Because of that and the fact that in rare occasions you might actually want the “for update” clause in the query of your function I decided to write this blog post as a small warning.

(See how easy it is to pick the wrong one ?)

First, for those who might not be very familiar with the result cache, I’ll show how it is supposed to work.
Then I will show the case where it doesn’t work as expected for me.

Go straight to

What we would expect
How to break the result cache
Conclusion

What we would expect

After a resultcache function is executed its return value is cached, so if it gets called again with the exact same parameter values it is not executed, but rather the cached value is immediately returned. This can obviously have a huge performance benefit.
But what if the function returns a value that depends on the data in one or more tables?
Well, Oracle handles that pretty well. As long as nothing changes to the tables the function depends on the cache can be used (if a value for the combination of parameter values has been cached before, of course).
The moment a table that the function depends on, or its data, changes the cache is invalidated, meaning that for every unique combination of parameter values used, the function will be actually executed again once and the new return value will be cached in the fresh cache. Let’s create a table that we can use in our result cache function

create table ero_tbl
as
select level                         as id
     , 'Original Value of '||level   as value
  from dual
connect by level <= 10
;

create table ero_tbl

select level as id

, 'Original Value of '||level as value

from dual

connect by level <= 10

;

The table has 10 rows, with IDs 1 through 10 and a column VALUE that contains ‘Original value’ for each row.

select *
  from ero_tbl
 order by id
;

select *

from ero_tbl

order by id

;

Now we need a result cache function that depends on this table.
I will create a simple function that accepts an ID as parameter and will return the value of the VALUE column for that ID.
And the function will put a message on screen so we can see if it’s been executed when called.

Yes, I know, the function should really be in a package, not standalone.
The function should at least handle the no_data_found exception.
Yadi-yadi-ya…… All that is not the point here, so for simplicity sake, this is the function we will work with.

create or replace function ero_rcf
(p_id  in  ero_tbl.id%type
)
return ero_tbl.value%type
result_cache
is
  l_result  ero_tbl.value%type;
begin
  dbms_output.put_line ('====> RCF executed for id = '||p_id);

  select value
    into l_result
    from ero_tbl
   where id = p_id
  ;
  return l_result;
end;
/

create or replace function ero_rcf

(p_id in ero_tbl.id%type

)

return ero_tbl.value%type

result_cache

l_result ero_tbl.value%type;

begin

dbms_output.put_line ('====> RCF executed for id = '||p_id);

select value

into l_result

from ero_tbl

where id = p_id

;

return l_result;

end;

We’re all set.
Let’s call this function for IDs 1 and 3 each multiple times, and see what happens.

begin
  dbms_output.put_line ('Value of id 1 = '||ero_rcf(1));
  dbms_output.put_line ('Value of id 1 = '||ero_rcf(1));
  dbms_output.put_line ('Value of id 1 = '||ero_rcf(1));

  dbms_output.put_line ('Value of id 3 = '||ero_rcf(3));
  dbms_output.put_line ('Value of id 3 = '||ero_rcf(3));
  dbms_output.put_line ('Value of id 3 = '||ero_rcf(3));
end;
/

begin

dbms_output.put_line ('Value of id 1 = '||ero_rcf(1));

dbms_output.put_line ('Value of id 3 = '||ero_rcf(3));

end;

As we can see from the output, the first time the function got called with parameter value 1 it was executed and it returned the VALUE of that ID.
Subsequent calls of the function with parameter value 1 did NOT execute the function, but the correct value was returned from the cache.

The same is true for parameter value 3. First time the function is called with *that* value it’s executed again. And that’s not just within that execution of that plsql block. If we run it again, we get

The function is not executed anymore at all, because the results are still in the cache.

The cached return values are available for any session until the cache gets invalidated or pushed out of the cache due to it filling up with newer cached values. But? But? What happens if a row in the table is changed?
Well, that means the cache no longer reflects the actual value(s) in the table that is used by the function.
Oracle is intelligent enough to detect the DML and invalidate any cache that depends on the effected table. This enforces that the function will be executed again the first time it is called with a certain set of parameter values.

update ero_tbl
   set value = 'UPDATED value of 3'
 where id    = 3
;

commit;

update ero_tbl

set value = 'UPDATED value of 3'

where id = 3

;

commit;

If we now run the exact same anonymous block as before, that calls the result cache function we see:

For ID = 1 the function returns the exact same value as before, because its data did not change.
But Oracle does not know exactly what the function does. It only knows the result of the function depends on the table, and the table is changed, so the complete result cache for the function is invalidated. This leads to the function being executed again once for ID = 1 and once for ID = 3.
Indeed we see that for ID = 3 a different value is returned after the update.

Go Back to top

How to break the result cache

In this part the result cache function will be a little bit different.
As I said in the intro of this post a cursor that was created within other functionality needed to be moved to my result cache function.
The other functionality was to be removed and the function was to largely take over the functionality.
The cursor in the function, however, needed the same functionality as in the old code.

So what does a lazy developer do in such a case?
He copy-pastes the cursor into the new function and considers that part of his task done, and done well.
I mean, how could it be wrong? I need the same functionality in my cursor and by copy-pasting it I didn’t change a thing. However, the old code locked the records it selected by using a “for update” clause.
That was sensible in that code. Not so much in the new result cache function.

Still, although it obviously didn’t belong there, I would not expect that clause to cause issues.
Sure, I unnecessarily claim the records for myself for a (very short) period of time, and deadlock issues could potentially occur if an autonomous transaction later in the code tries to manipulate the same rows. But the claiming wasn’t a problem in this case and there was not such an autonomous transaction.

So, the fact that I didn’t notice my mistake shouldn’t cause problems.
… Hold my beer Let’s recreate the result cache function we used before. The only difference is the added “for update” clause in the query.

create or replace function ero_rcf
(p_id  in  ero_tbl.id%type
)
return ero_tbl.value%type
result_cache
is
  l_result  ero_tbl.value%type;
begin
  dbms_output.put_line ('====> RCF executed for id = '||p_id);

  select value
    into l_result
    from ero_tbl
   where id = p_id
  for update of value 
  ;
  return l_result;
end;
/

create or replace function ero_rcf

(p_id in ero_tbl.id%type

)

return ero_tbl.value%type

result_cache

l_result ero_tbl.value%type;

begin

dbms_output.put_line ('====> RCF executed for id = '||p_id);

select value

into l_result

from ero_tbl

where id = p_id

for update of value

;

return l_result;

end;

We’ll continue to work on the same table as before:

select *
  from ero_tbl
 order by id
;

select *

from ero_tbl

order by id

;

Let’s call the function multiple times for IDs 5 and 7, similarly to what we did before:

begin
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));

  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
end;
/

begin

dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));

dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));

end;

Exactly what we would expect: For each ID the function is executed once, and subsequent calls to the function return the value from the cache.

When we perform some DML on the underlying table…

update ero_tbl
   set value = 'UPDATED value of 7'
 where id    = 7
;

commit;

update ero_tbl

set value = 'UPDATED value of 7'

where id = 7

;

commit;

The data must be changed…..

select *
  from ero_tbl
 order by id
;

select *

from ero_tbl

order by id

;

Check!

And the result cache should be invalidated, causing the function to be executed again.

begin
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));

  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
end;
/

begin

dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));

dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));

end;

WAIT????
WHAT???

The function just continues to use the cached values ?

What happens if we do a radical change to the table:
Truncate it….

truncate table ero_tbl;

select *
  from ero_tbl
 order by id
;

truncate table ero_tbl;

select *

from ero_tbl

order by id

;

So now there’s absolutely nothing left in the table.
Surely now the cache is invalidated?

begin
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));
  dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));

  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
  dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));
end;
/

begin

dbms_output.put_line ('Value of id 5 = '||ero_rcf(5));

dbms_output.put_line ('Value of id 7 = '||ero_rcf(7));

end;

Nope!

Go Back to top

Conclusion

As you can see, if the result cache function contains a “for update” clause, the function gets executed once for each unique set of parameter values as it’s supposed to be, but no matter what you do, short of dropping the table, the cached values are not invalidated which leads to incorrect return values when the data in the table the function relies upon is changed.

So, be careful to not make a result cache function that contains a “for update” clause, because the function will work correctly but will return bad data. The damage it does depends on how long it takes until you notice errors in the data.

Go Back to top

Reacties

It’s so easy to pick the wrong one #JoelKallmanDay — Geen reacties

Geef een reactie Reactie annuleren

Deze site gebruikt Akismet om spam te verminderen. Bekijk hoe je reactie gegevens worden verwerkt.

HTML tags allowed in your comment: <a href="" title=""> <abbr title=""> <acronym title=""> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <q cite=""> <s> <strike> <pre class="" title="" data-url="">

EvROCS

Completing the puzzle

It’s so easy to pick the wrong one #JoelKallmanDay

Go straight to

What we would expect

How to break the result cache

Conclusion

Reacties

It’s so easy to pick the wrong one #JoelKallmanDay — Geen reacties

Geef een reactie Reactie annuleren