It’s so easy to pick the wrong one #JoelKallmanDay
I started this blog enthusiastically. And then life happened. Today is Joel Kallman Day, so it seems a good day to pick this up again.
Some time ago I ran into a problem with my code, where the cache for a result cache function wasn’t invalidated if DML was performed on the table the function selected from.
This led to the function returning old values because it was under the impression the cache was still valid.
I’ve checked and seen this behavior in all versions between 12.1.0.2 and the currently highest available version 23c FREE.
In the end it turned out to be caused by a “for update” clause that I unintentionally copied when I put the query into the function from someplace else.
So, it wouldn’t have happened if paid more attention to what I was copying.
Nevertheless, the combination of a result cache and a “for update” is as far as I know perfectly valid, although it implies some bad practice.
The behavior was (for me) unexpected, so it took me a couple of hours to find the cause, because I was concentrating on what I was doing wrong in the rest of the code.
Because of that and the fact that in rare occasions you might actually want the “for update” clause in the query of your function I decided to write this blog post as a small warning.
First, for those who might not be very familiar with the result cache, I’ll show how it is supposed to work.
Then I will show the case where it doesn’t work as expected for me.
Go straight to
What we would expect
How to break the result cache
Conclusion
What we would expect
After a resultcache function is executed its return value is cached, so if it gets called again with the exact same parameter values it is not executed, but rather the cached value is immediately returned. This can obviously have a huge performance benefit.
But what if the function returns a value that depends on the data in one or more tables?
Well, Oracle handles that pretty well. As long as nothing changes to the tables the function depends on the cache can be used (if a value for the combination of parameter values has been cached before, of course).
The moment a table that the function depends on, or its data, changes the cache is invalidated, meaning that for every unique combination of parameter values used, the function will be actually executed again once and the new return value will be cached in the fresh cache. Let’s create a table that we can use in our result cache function
1 2 3 4 5 6 7 |
create table ero_tbl as select level as id , 'Original Value of '||level as value from dual connect by level <= 10 ; |
The table has 10 rows, with IDs 1 through 10 and a column VALUE that contains ‘Original value’ for each row.
1 2 3 4 |
select * from ero_tbl order by id ; |
Now we need a result cache function that depends on this table.
I will create a simple function that accepts an ID as parameter and will return the value of the VALUE column for that ID.
And the function will put a message on screen so we can see if it’s been executed when called.
Yes, I know, the function should really be in a package, not standalone.
The function should at least handle the no_data_found exception.
Yadi-yadi-ya…… All that is not the point here, so for simplicity sake, this is the function we will work with.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
create or replace function ero_rcf (p_id in ero_tbl.id%type ) return ero_tbl.value%type result_cache is l_result ero_tbl.value%type; begin dbms_output.put_line ('====> RCF executed for id = '||p_id); select value into l_result from ero_tbl where id = p_id ; return l_result; end; / |
We’re all set.
Let’s call this function for IDs 1 and 3 each multiple times, and see what happens.
1 2 3 4 5 6 7 8 9 10 |
begin dbms_output.put_line ('Value of id 1 = '||ero_rcf(1)); dbms_output.put_line ('Value of id 1 = '||ero_rcf(1)); dbms_output.put_line ('Value of id 1 = '||ero_rcf(1)); dbms_output.put_line ('Value of id 3 = '||ero_rcf(3)); dbms_output.put_line ('Value of id 3 = '||ero_rcf(3)); dbms_output.put_line ('Value of id 3 = '||ero_rcf(3)); end; / |
As we can see from the output, the first time the function got called with parameter value 1 it was executed and it returned the VALUE of that ID.
Subsequent calls of the function with parameter value 1 did NOT execute the function, but the correct value was returned from the cache.
The same is true for parameter value 3. First time the function is called with *that* value it’s executed again. And that’s not just within that execution of that plsql block. If we run it again, we get
The function is not executed anymore at all, because the results are still in the cache.
The cached return values are available for any session until the cache gets invalidated or pushed out of the cache due to it filling up with newer cached values. But? But? What happens if a row in the table is changed?
Well, that means the cache no longer reflects the actual value(s) in the table that is used by the function.
Oracle is intelligent enough to detect the DML and invalidate any cache that depends on the effected table. This enforces that the function will be executed again the first time it is called with a certain set of parameter values.
1 2 3 4 5 6 |
update ero_tbl set value = 'UPDATED value of 3' where id = 3 ; commit; |
If we now run the exact same anonymous block as before, that calls the result cache function we see:
For ID = 1 the function returns the exact same value as before, because its data did not change.
But Oracle does not know exactly what the function does. It only knows the result of the function depends on the table, and the table is changed, so the complete result cache for the function is invalidated. This leads to the function being executed again once for ID = 1 and once for ID = 3.
Indeed we see that for ID = 3 a different value is returned after the update.
How to break the result cache
In this part the result cache function will be a little bit different.
As I said in the intro of this post a cursor that was created within other functionality needed to be moved to my result cache function.
The other functionality was to be removed and the function was to largely take over the functionality.
The cursor in the function, however, needed the same functionality as in the old code.
So what does a lazy developer do in such a case?
He copy-pastes the cursor into the new function and considers that part of his task done, and done well.
I mean, how could it be wrong? I need the same functionality in my cursor and by copy-pasting it I didn’t change a thing. However, the old code locked the records it selected by using a “for update” clause.
That was sensible in that code. Not so much in the new result cache function.
Still, although it obviously didn’t belong there, I would not expect that clause to cause issues.
Sure, I unnecessarily claim the records for myself for a (very short) period of time, and deadlock issues could potentially occur if an autonomous transaction later in the code tries to manipulate the same rows. But the claiming wasn’t a problem in this case and there was not such an autonomous transaction.
So, the fact that I didn’t notice my mistake shouldn’t cause problems.
… Hold my beer Let’s recreate the result cache function we used before. The only difference is the added “for update” clause in the query.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
create or replace function ero_rcf (p_id in ero_tbl.id%type ) return ero_tbl.value%type result_cache is l_result ero_tbl.value%type; begin dbms_output.put_line ('====> RCF executed for id = '||p_id); select value into l_result from ero_tbl where id = p_id for update of value ; return l_result; end; / |
We’ll continue to work on the same table as before:
1 2 3 4 |
select * from ero_tbl order by id ; |
Let’s call the function multiple times for IDs 5 and 7, similarly to what we did before:
1 2 3 4 5 6 7 8 9 10 |
begin dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); end; / |
Exactly what we would expect: For each ID the function is executed once, and subsequent calls to the function return the value from the cache.
When we perform some DML on the underlying table…
1 2 3 4 5 6 |
update ero_tbl set value = 'UPDATED value of 7' where id = 7 ; commit; |
The data must be changed…..
1 2 3 4 |
select * from ero_tbl order by id ; |
Check!
And the result cache should be invalidated, causing the function to be executed again.
1 2 3 4 5 6 7 8 9 10 |
begin dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); end; / |
WAIT????
WHAT???
The function just continues to use the cached values ?
What happens if we do a radical change to the table:
Truncate it….
1 2 3 4 5 6 |
truncate table ero_tbl; select * from ero_tbl order by id ; |
So now there’s absolutely nothing left in the table.
Surely now the cache is invalidated?
1 2 3 4 5 6 7 8 9 10 |
begin dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 5 = '||ero_rcf(5)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); dbms_output.put_line ('Value of id 7 = '||ero_rcf(7)); end; / |
Nope!
Go Back to topConclusion
As you can see, if the result cache function contains a “for update” clause, the function gets executed once for each unique set of parameter values as it’s supposed to be, but no matter what you do, short of dropping the table, the cached values are not invalidated which leads to incorrect return values when the data in the table the function relies upon is changed.
So, be careful to not make a result cache function that contains a “for update” clause, because the function will work correctly but will return bad data. The damage it does depends on how long it takes until you notice errors in the data.
Reacties
It’s so easy to pick the wrong one #JoelKallmanDay — Geen reacties
HTML tags allowed in your comment: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">