Oft times when trying to make the data suitable for inclusion in the new applications, I ask my self "What were they thinking?"
I will leave the answer to that up to your imagination, as my answers to that particular question are not always complimentary.
One of the problems run into is when the data modeler and database designer chose to allow data with encoded special meanings.
In other words, there is key data in the database, the meaning of which can only be ascertained by asking someone that knows, or finding it in the documentation (Is there documentation?)
The example I will use is a real one I ran into.
Given a table containing projects, and revisions of those projects that are indicated as such by a revision "number".
A sensible series of revision numbers might be a numeric range beginning with 1, or even an alpha range beginning with 'A', or even a combination thereof.
Personally, I prefer the unambiguous nature of an ascending numeric range. 2 is always greater than 1. There is no case involved as there would be with an alpha range.
Which is greater, "a1" or "A1"? You know how this will sort in the database, but will it be what the users expect?
While a numeric range would have been preferable, the data I was given to work with used a range of revision "numbers" that was numeric and alpha, with a numeric value being regarding as greater than the "numeric" value. The "numeric" is in quotes, as this obviously must be stored as a string, and in this case with a leading zero.
Given this rule, a project with revisions of A,B,C,D,01,02 would have a most recent revision of "02". This is not the way it works in the standard collating order in the database.
11:34:03 SQL> create table rev (version varchar2(2));
11:34:03 SQL> insert into rev values('A');
11:34:03 SQL> insert into rev values('B');
11:34:03 SQL> insert into rev values('C');
11:34:03 SQL> insert into rev values('01');
11:34:03 SQL> insert into rev values('02');
11:34:03 SQL> select * from rev order by version;
5 rows selected.
From the data provided, only the most recent version was to be included in the imported data.
With a versioning system that doesn't follow the normal collating order, this requires some code to determine what is really the greatest version.
If you know a method to accomplish this in straight SQL, please feel free to post it. I could not think of a pure SQL solution.
The following generic function was created to determine the greatest revision value given a table name and the key columns.
-- return the maximum numeric revision
-- returna alpha if no numeric exists
-- maximum rev is numeric
-- revs are alpha - mixed number and character
create or replace function maxrev
v_max_rev integer := 0;
type curType is ref cursor;
function is_number( chk_data_in varchar2 )
dummy := to_number(chk_data_in);
when value_error then
when others then
-- get the maximum rev, whether alpha or numeric
-- there may not be any numerics
v_sql := 'select max(' || value_column_in || ') from ' || table_name_in || ' where ' || key_column_in || ' = :1';
execute immediate v_sql into v_return_rev using key_value_in;
v_sql := 'select ' || value_column_in || ' from ' || table_name_in || ' where ' || key_column_in || ' = :1';
open l_cursor for v_sql using key_value_in;
fetch l_cursor into v_col_rev;
exit when l_cursor%notfound;
if (is_number(v_col_rev)) then
v_tmp_rev := to_number(v_col_rev);
if (v_tmp_rev > v_max_rev) then
v_max_rev := v_tmp_rev;
v_return_rev := v_col_rev;
Let me know if you spot it. )
Here's a test case to prove that the function works as expected.
-- should always return numeric if it exists, otherwise alpha
drop table maxrev_test;
create table maxrev_test ( id varchar2(4), rev varchar2(2));
insert into maxrev_test values('T1', 'A');
insert into maxrev_test values('T1', 'B');
insert into maxrev_test values('T1', '01');
insert into maxrev_test values('T1', '02');
insert into maxrev_test values('T2', '01');
insert into maxrev_test values('T2', '02');
insert into maxrev_test values('T2', '03');
insert into maxrev_test values('T3', 'X');
insert into maxrev_test values('T3', 'Y');
insert into maxrev_test values('T3', 'Z');
select * from maxrev_test order by id,rev;
col rev format a10
prompt Incorrect results
select id, max(rev) rev
group by id
order by id
prompt Correct results
select id, maxrev('maxrev_test','id','rev',id) rev
group by id
order by id
And the results:
3 rows selected.
3 rows selected.