Thursday, November 12, 2009

Data Modeling


Most readers of the blog are probably DBAs, or do DBA work along with development or other duties.

Though my title is DBA, Data Modeling is something I really like to do.

When first learning Oracle, I cut my teeth on data modeling, and used CASE 5.1 on Unix to model a database system. True, CASE 5.1 used an Oracle Forms 3.x based interface, and the GUI modeling was Unix only.

That was alright with me, as the Form interface allowed manual changes to be made quite quickly.

And the graphic modeling tool was fairly decent, even on a PC running Hummingbird X Server.

When Designer 2000 came out, it was clearly a more capable tool. Not only did it do everything that CASE 5.1 could do, it could do more. I won't make any silly claim that I was ever able to fully exploit D2K, as it was an end-to-end tool that could do much more than model data and databases.

What it could do with just the databases, however, was quite good.  Data models could be created, and then a physical database could be generated from the model.

Changes in the physical database could be reverse engineered back to the model, and changes in the model could be forward engineered into the physical database. D2K could truly separate logical and physical models, and allow changes to be migrated back and forth between the two.

There are other high end tools such as Erwin which can no doubt accomplish the same thing, but I have not used them.  

One important differentiation for me between D2K and other tools was that D2K worked with Barker Notation, which is the notation I first learned, and the one I still prefer.  

I should not speak of Designer 2000 in the past tense I guess, as it is still available from Oracle as part of the Oracle Developer Suite, but is now called Oracle Designer.  It just hasn't received much attention in the past few years, as I think many people have come to think of data modeling as too much overhead.

I've tried several low end tools in the past few years, and while some claim to separate logical and physical models, those that I have tried actually do a rather poor job of it.

All this leads to some new (at least, new to me) developments from, of all places, Microsoft.

Maybe you have heard of Oslo, Microsoft's Data Modeling toolset that has been in development for the past couple years.

If you're just now hearing about it, you will likely be hearing much more. The bit I have read has made me think this will be a very impressive tool.

If you have done data modeling, you have likely used traditional tools that allow you to define entities, drop them on a graphical model, and define relationships.

The tool you used may even have allowed you to create domains that could be used to provide data consistency among the entities.

Oslo is different.  

Oslo incorporates a data definition language called M. The definitions can be translated to T-SQL, which in turn can be used to create the physical aspects of the model.  M also allows easy creation of strongly typed data types which are carried over into the model.

Whether Oslo will allow round trip engineering ala D2K, I don't yet know.

I do however think this is a very innovative approach to modeling data. 

Here are a few Oslo related links to peruse :


You may be thinking that I have given SQL Developer Data Modeler short shrift.

Along with a lot of other folks, I eagerly anticipated the arrival of SQL Developer Data Modeler.

And along with many others, I was disappointed to learn that this add on to SQL Developer would set us back a cool $3000 US per seat.  That seems a pretty steep price for a tool that is nowhere near as capable as Oracle Designer, which is included as part of the Oracle Internet Developer Suite. True, at $5800 the Suite is nearly double the price of SQL Developer Data Modeler, but you get quite a bit more than just Designer with the Suite.

As for the cost of Oslo, it's probably too early to tell.

Some reading suggests that it will be included as part of SQL Server 2008, but nothing seems to be certain yet.

Why all the talk about a SQL Server specific tool?

Because data modeling has been in a rut for quite some time, and Microsoft seems to have broken out of that rut.  It's time for Oracle to take notice and provide better tools for modeling, rather than upholding the status quo.







Monday, November 09, 2009

MetaLink, we barely knew ye

But, we wish we had more time to get better acquainted.

If you work with Oracle, you probably know that MetaLink went the way of the Dodo as part of an upgrade to My Oracle Support during the weekend of November 6th, 2009.

And so far it hasn't gone too well, as evidenced by these threads on Oracle-L:

Issues with My Oracle Support
Metalink Fiasco

Many people were lamenting the loss of MetaLink well before its demise, but I don't think any were quite expecting the issues that are currently appearing.

A few have reported that it is working fine for them, but personally, I have found  it unusable all morning.

At least one issue with MetaLink appears to have been cleared up in MOS.  I ran into it last week, while I was still able to log in to MetaLink.

During a routine audit of who had access to our CSI numbers, I came across a group of consultants that were no longer working for our company, and froze their accounts.  The next day I received a frantic voice mail  from a member of the consulting firm, and he informed me that they had no access to MetaLink because I had frozen their accounts.


I returned the call just a few minutes later, but they had already been able to resolve the issue, as one of their consultants with admin rights had been unaffected, and was able to unfreeze their accounts.


Removing them from the CSI is the better procedure, but in the past when I have attempted to do so, I found that there were still open issues owned by the accounts, and could not remove them. The application owners had been very clear that this access should be removed, so in those cases I froze the accounts, and that is what I did on this occasion as well.


This all seemed quite bizarre to me.  This must be a very strange schema in the ML user database, and some strange logic to go along with it.  By granting a user access to a CSI, MetaLink was giving me Carte Blanche to effectively remove them from MetaLink.

How has My Oracle Support fixed this?  Try as I might, I could not find a 'freeze' button in user administration in MOS.  So the fix seems to have been "remove the button".

Thursday, October 15, 2009

Oracle Open World 2009 Report - Part Two


Tuesday October 13th

Unconference on Indexes
Richard Foote
10/13/2009 10:00 AM

I started off the day attending the indexing presentation of fellow Oak Table member Richard Foote.  Foote has become quite well known for his expertise on index internals since the publication of Oracle B-Tree Index Internals: Rebuilding the Truth.

This was basically a Q&A session, and I will include just a couple of the questions.

Q: Have you ever seen an index Skip Scan used correctly?
A: The short answer was 'No'

Foote mentioned that he had only rarely seen an index skip scan used, and then inappropriately.  For more information on skip scan, see Foote's blog entry on Index Skip Scans

Q: When can you safely drop an index that doesn't seem to be used?
A: That is very difficult to determine

The explanation for this answer is that it is very difficult to determine if an index is never used.  This could be for a variety of reasons.

* index monitoring does not detect when an index is used for SELECT
* the code that uses the index may be run only rarely, making it difficult to determine if it is actually used

Oracle Closed World

OCW actually started on Monday, though due to the wonders of technology I missed it on that day.  The event was invitation only, either by being present when it was mentioned, or by receiving an SMS text on your phone.

This is where technology comes in.  The SMS was rather garbled, and I received through a series of very short SMS messages what seemed to be an invitation to stroll into a dark alley somewhere in downtown San Francisco.  It was later cleared up and I attended on Tuesday.

Oracle Closed World is the brainchild of Mogens Norgaard, another Oak Table member, and co-founder of Miracle AS, an Oracle consulting firm.

On Tuesday Jonathan Lewis reprised his "How to be an Expert" presentation, the difference being that this audience was comprised of folks with a wide breadth of Oracle knowledge.

Lewis took advantage of this by making the questions harder, and chiding the audience for not knowing the answers.  All was in good fun. Undoubtedly the presence of beer didn't make the questions any easier to answer.

Wednesday was a presentation by Jeremiah Wilton, Oak Table member and formerly a DBA at Amazon.com.

Wilton presented a live demo on using Amazon's Elastic Compute Cloud (EC2) to provision a Linux server, using Elastic Block Store (EBS) to provide persistent storage, and a preconfigured Amazon Machine Image (AMI) to provision the server with Oracle already installed.

The fact that Wilton was able to complete the task during a 1 hour live demo, inevitable mishaps and all, was quite impressive.

This appears to be a great method to set up test instances of Oracle for experimentation.  There are companies using this for production as well.

 Amazon Web Services

Perl - A DBA's and Developers Best (forgotten) Friend
Arjen Visser - Avisit Solutions
10/13/2009 

Perl is a topic near and dear to my heart.

I have been using it since version 4 in the early 1990s, and have advocated its use ever since.  It is a robust and powerful language with a huge collection of libraries developed by the user community and archived in the Comprehensive Perl Archive Network (http://cpan.org/).

When I spotted the Perl session on the schedule I immediately signed up for it.

What I had not noticed was the subtitle indicating it was a session for beginners.

No matter, I had to go.

The session began with a concise but clear introduction to Perl basics.

So far, so good.

When the time came to discuss Perl connectivity to Oracle, it was a bit surprising to be confronted with a slide showing how to use Perl as a wrapper for sqlplus.

"Surely" I thought, "this is going to be a slide showing how not to do it"

If you have used Perl with Oracle, you are no doubt familiar with DBI and DBD::Oracle.

DBI is the Perl Database Interface module developed and maintained by Tim Bunce.

DBD::Oracle is the Oracle driver for DBI, also originally developed and maintained by Tim Bunce, and now being maintained by The Pythian Group.

DBI and DBD::Oracle are very mature and robust Perl packages for using Perl with Oracle.

You would also likely know that using Perl as a wrapper for sqlplus is something that is very cumbersome and inelegant. So as not to write a whole treatise on why you shouldn't do this, I will simply say that doing so is rarely necessary, and never an optimal method.

Which brings us to the next slide in the presentation, which had a diagram showing how DBI and DBD::Oracle fit into the Perl architecture.

The speaker then told the audience that these were hard to install and difficult to use, and didn't recommend using them.

After picking my jaw back up off the floor, I lost all interest in the rest of the presentation.  I don't remember what the rest of the slides were.  Maybe I blacked out from the shock. What I remember is walking away from the presentation rather incredulous.

Just last week, a friend that had not used Perl asked me how to install it on a Solaris server.  With only a few lines of email that I typed from memory he was able to successfully install DBI and DBD::Oracle.

Hard to install indeed.

11 Things about 11gR2
Tom Kyte

Really it was Tom's top 10 list for 11gR2 - he liked his favorite feature so much he counted it twice.

And that is the one I will mention.

It is Edition Based Redefinition.

In a nutshell this feature allows you to create a new namespace for PL/SQL objects, creating new versions in a production database.

This will allow upgrading applications with little or no downtime, something that has always been one of the DBA holy grails.

Rather than try to explain it (OK, I don't yet know how it works) I will just tell you to take a look at Chapter 19 in the 11gR2 Advanced Application Developer's Guide.

Wednesday Keynote
Larry Ellison

Ellison promised to discuss 4 topics; I will include 3 of them.

I left before the Fusion Middleware discussion.

Oracle Enterprise Linux update

One interesting fact presented was a survey performed by HP detailing Linux usage in corporate data centers.  The numbers are rather surprising.

* Oracle Enterprise Linux 65%
* Redhat 37%
* Suse 15%
* Other 2%


Next was the second generation of the Exadata Database Machine.

Essentially it is faster than gen 1.

It was used to set a new TPC-C benchmark record - I believe it was 1,000,000 transactions per second.

Ellison was proud of the record being 16 times faster than the record previously set by IBM, and rightfully so if those numbers are correct.

It seems IBM has challenged the results however, claiming the Exadata 2 is 'only 6 times faster'.  As you might imagine, Ellison had some fun with that, even offering a $10 million prize to anyone that can show that a Sun Exadata machine cannot run the app at least twice as fast as any other system.  IBM is invited to participate.

At this time Ellison welcomed a special guest to the stage: California Governor Arnold Schwarzenegger.

Commenting on being in a room with so many IT folks, Schwarzenegger said "As I came out on stage I felt my IQ shoot up 10 pts."

Schwarzenegger talked for a few minutes on the impact of technology on people's lives. "Technology's impact is flesh and blood" was said in reference to how tech is used to aid the response of public services such as firefighting.

Arnold called for a round of applause for Larry Ellison and Scott McNealy for being technology leaders.

The camera cut to Ellison, looking uncharacteristically humble as he mouthed 'Thank you'.

After Schwarzenegger left the stage, Ellison continued, this time discussing My Oracle Support.

My Oracle Support has been a hot topic lately, as the majority of DBAs are less than thrilled with the new Flash interface being imposed.  It is my understanding that an HTML version of the interface will be maintained, so we won't have to deal with Flash if we don't want to.

Here's where it gets interesting - the unification of Oracle Enterprise Manager and My Oracle Support.

There is now a 'My Oracle Support' tab in OEM.

DBAs will be allowed to opt in to OCM, Oracle Configuration Manager, allowing Oracle to perform automated discovery of bugs and needed patches, either in Oracle software or in other vendors' software on the server (OS bugs, for example).

Oracle will then have a global database to mine for proactive response to possible problems.

When a configuration is found to have issues, all users with that configuration can be proactively notified.

The real news IMO though is the impact on patching.

Oracle recently started offering a new patch package - PSU.

This is different than the CPU patch system, as it may require merge patches to resolve patch conflicts.

If OEM My Oracle Support determines that a merge patch is needed, it will automatically file an SR requesting the patch and notify you when it is available.

Even if you don't like OEM, this may be a good use of it.

Ok, that's enough for now, time for lunch.



Wednesday, October 14, 2009

Oracle Open World Report for October 11th and 12th

As I am attending Open World 2009 on blogger credentials, it seems proper I should  actually blog about it.

So, here it is.  I won't be blogging about keynotes or other things that will appear in the news the following day, but rather on some of the sessions I attend.

As I got back to my room too late and too tired to do this properly on Monday, I am putting Sunday and Monday in the same post.

Here goes:


Open World - Sunday 10/11/2009

While attending Oracle Open World 2009, I thought it a good idea to make some report of sessions attended, and any interesting developments at OOW.

Some of the sessions I attended may not be considered DBA topics. I thought it would be interesting to break out of the DBA mold for a bit and attend some sessions that might be a bit outside the DBA realm.

Sue Harper - Everyday Tasks with Oracle SQL Developer

Sue Harper is the product manager for SQL Developer, and was presenting some of the useful new features of the SQL Developer 2.1 Early Adopter release.

While I have used SQL Developer from the time it was first released as Raptor, until recently I had used it simply as a database browsing tool.  After exploring some of the features that allow writing reports with master/detail sections, I converted some SQL*Plus scripts for use with SQL Developer.

SQL Developer is a very capable tool, so I attended this session to see what else I might be missing out on.

There was only one hour allocated for the session, and given the first 15 minutes were consumed convincing the audience why they should be using SQL Developer, there was just that much less time available to see the new features.

Taking a few minutes to market it is probably just in the product manager DNA.

Some of the features demonstrated were actually available in 1.5, but maybe not widely known.  As I have not used 2.1, I won't always differentiate between versions here. Some of these features may not be new to 2.1, maybe just improved.

Though not a new feature in 2.1, a few minutes were used to demonstrate the use of the built in version control integration. This is a very useful feature, and can be set up for seamless integration with CVS, Subversion, Perforce, and one other I can't recall now.  It's definitely worth a look.

Some features that are new to 2.1 looked very useful:

Persistent column organizing and hiding.  When viewing data in SQL Developer, the columns may be easily rearranged and selected or de-selected for viewing.  While previous versions allowed dragging columns around, 2.1 has a nice dialog that makes this much easier.

New to 2.1 is column filtering.  By right clicking on a cell in the data pane, a dialog can be brought up to filter the data based on values found.  This allows filtering the data without requerying the table.

Also new to 2.1 is an XML DB Repository Navigator. It was mentioned, but alas there was not time to demonstrate it.

http://www.oracle.com/technology/products/database/sql_developer/index.html

http://sqldeveloper.oracle.com/


Christopher Jones - Python/Django Framework

This was a hands on developer session centered on using the Python scripting language with the Django Web application framework.  This was a fun session.  The lab was already set up, running Oracle Linux VMs with access via individual laptops set up in the training room.

The lab was a go at your own pace session, with instructions both printed and available via browser.  Browser based was the way to go with the instructions, as the examples could be cut and pasted, saving a lot of time typing.

I wasn't able to quite complete the lab as I kept encountering an error when running the web app.  It was probably due to an error in one of the scripts I modified during the session, but enough was accomplished to see that the Django Framework looked very interesting.  Perhaps even simple enough to use for a DBA.  Yes, I did search the schedule for a similar Perl session, perhaps using Mason or somesuch.

The training materials are going to be placed on OTN in the Oracle By Example section after Open World concludes.

http://www.oracle.com/technology/obe/start/index.html

Ray Smith - Linux Shell Scripting Craftsmanship

The last session I attended on Sunday was all about shell script craftsmanship. Ray Smith was sharing some common sense methods that can be used to greatly enhance your shell scripts.

If you have done any software development, the information presented would be similar to what you already know.

  • Use white space and format your code for readability.
  • Don't be overly clever - other people will have to read the shell script.
  • Format your scripts with a header explaining the purpose of the script, and separate sections for independent and dependent variables, and a section for the code.
  • Use getopts to control command line arguments.
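
A minimal sketch of how those points might look in one script. The script name logname.sh, the variable names and the defaults are all made up for illustration; none of this is taken from Smith's slides.

```shell
#!/usr/bin/env bash
# ----------------------------------------------------------------
# logname.sh (hypothetical) - print the log file name for a database
# usage: logname.sh [-d database] [-l log_directory]
# ----------------------------------------------------------------

# --- independent variables: defaults the command line may override
DATABASE=orcl
LOG_DIR=/tmp

usage() {
    echo "usage: $0 [-d database] [-l log_directory]" >&2
}

# --- command line arguments, handled with getopts
parse_args() {
    local arg
    OPTIND=1    # reset so the function can be called more than once
    while getopts d:l: arg
    do
        case $arg in
            d) DATABASE=$OPTARG ;;
            l) LOG_DIR=$OPTARG ;;
            *) usage; return 1 ;;
        esac
    done
}

# --- dependent variables: derived from the values above
set_dependent_vars() {
    LOG_FILE="$LOG_DIR/${DATABASE}_stats.log"
}

# --- code
main() {
    parse_args "$@" || return 1
    set_dependent_vars
    echo "logging to $LOG_FILE"
}

main "$@"
```

Keeping the derived values in their own section means a change to a default only has to be made in one place.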


Smith strongly advocated that everyone in the audience obtain a copy of the book "The Art of Unix Programming" by Eric S. Raymond.  This is not a new book by any means, but Smith drew upon it for many of the principles he advocated in scripting.



A couple of tools new to me were mentioned:

Zenity and Dialog - both of these are graphical dialog boxes that may be called from shell scripts in linux.

http://freshmeat.net/projects/zenity

Dialog is installed with linux, so just do man dialog to check it out.

It was an interesting presentation.  Though a lot of it was not new to me, the two dialog tools mentioned were, showing that no matter how well you think you may know a subject, you can always learn something from someone else.


Open World - Monday 10/12/2009


Jonathan Lewis Unconference - How to be an Expert
http://www.jlcomp.demon.co.uk/
http://jonathanlewis.wordpress.com/

Jonathan Lewis had an interesting unconference presentation.

http://wiki.oracle.com/page/Oracle+OpenWorld+Unconference

In a nutshell, it comes down to this:

You must practice, and practice quite a lot.

To make the point, he used the joke about the American tourist asking the groundskeeper how the lawns of the Royal Estates are maintained to be so lush, have such an even texture and in short, to be so perfect.

The groundskeeper explained while the tourist took notes.

First you must dig down 4 inches.

Then you must put down a layer of charcoal.

Then another 1 inch layer of fine sharp sand.

Finally a layer of fine loam goes on top.

You then must seed the lawn, and water it very well for 6 weeks.

After 6 weeks, you must cut the grass, being very careful to remove only a small amount as you mow.  This must be done three times a week.

And then you continue doing this for 200 years.

Ok, everyone had a good laugh at that, but the point was made.

Reading some books and being able to run some scripts does not make you an expert.  Lots and lots of practice may make you an expert, if you apply yourself well.

During the presentation he asked a number of questions of the audience, made up mostly of DBAs. I will reprise a couple of them here.

Q1:  Assuming you have a simple heap table, with no indexes, you update a single column in 1 row of the table.  How many bytes of redo will that generate?

Q2: Who of you in the audience when you insert data into a table, deliberately insert duplicate data into the database?

I will leave you to speculate on the answers a bit.

Of those 2 questions, only 1 was answered correctly by the audience.

Leng Tan and Tom Kyte DBA 2.0 - Battle of the DBA's

What is the difference between a DBA 1.0 (the old days) and a DBA 2.0 (the modern DBA)?

DBA 2.0 has modern tools, self managing database enabled by AWR and the Diag and Tuning packs.

DBA 1.0 uses scripts and works from the command line.

On the stage in addition to Kyte and Tan were two DBAs, each with a laptop and an Oracle server to work on.

Two scenarios were presented for a timed hands on problem that each DBA must work through.

First scenario - Security Audit

Each DBA is given 6 minutes to do a database audit and report on possible vulnerabilities

DBA 1.0 ran scripts to check for open accounts, default passwords, publicly granted packages and umask settings.

After doing so he ran a script to remove privileges granted to PUBLIC, and locked a couple of accounts.

DBA 2.0

DBA 2.0 worked from the Oracle Enterprise Manager console, using the Secure Configuration for Oracle Database.

He was able to observe the database security score, navigate through several screens and correct the same security problems that DBA 1.0 did.  Following that he was able to see that the security score for the database had improved.

So the conclusion made by the presenter is that OEM is clearly superior because OEM will automatically generate the needed data every night.

By contrast DBA 1.0 can only do one db at a time.

I do not believe this demonstration to be a valid comparison - it's quite simple to run the same checks against any number of databases from a script, and report on anomalies.

At this point it should be mentioned that DBA 1.0 spent 4 minutes explaining what he was going to do, another minute explaining what the scripts were doing, with less than 1 minute spent actually running the scripts.

By comparison, DBA 2.0 was navigating through screens through nearly the entire 6 minutes.

The statement was made by the presenter that doing this with scripts at the command line was far too tedious a task, and DBA 1.0 would never be able to accomplish the task for 200 databases.

I won't belabor the point (well, not too much) but automating these kinds of tasks is relatively simple for command line tools.  Which is easier and more productive?  Automating a set of scripts to poll all of your databases, or navigate through OEM for 200 databases?
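
To give an idea of what that automation might look like, here is a minimal sketch. check_db is a hypothetical stand-in that fakes its result so the sketch runs anywhere; a real version would run the audit queries through sqlplus or similar. The database names are invented.

```shell
#!/usr/bin/env bash

# Hypothetical check: a real script would connect to the database here.
# To keep the sketch runnable without Oracle, any name ending in "dev"
# is treated as having a finding.
check_db() {
    local db=$1
    case $db in
        *dev) echo "default password in use"; return 1 ;;
        *)    return 0 ;;
    esac
}

# Read database names from stdin, one per line, and report anomalies only.
audit_all() {
    local db finding failures=0
    while read -r db
    do
        [ -z "$db" ] && continue
        if ! finding=$(check_db "$db"); then
            echo "$db: $finding"
            failures=$((failures + 1))
        fi
    done
    echo "databases with findings: $failures"
}

# The list could just as easily come from a file of 200 names.
printf '%s\n' orcl hrdev finprod crmdev | audit_all
```

Run from cron, a loop like this would produce the nightly report without anyone navigating a single screen.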

The presenter referred to using OEM as "really convenient".  Sorry, but I have never found OEM to be at all convenient.  Whenever I run into problems with it, it requires an SR to fix it.

There was a round 2 as well, regarding testing execution plans both before and after setting optimizer_features_enable to a newer version.  OEM fared well here compared to the scripting method, as the scripts used 'explain plan' and OEM actually executed the queries to gather execution plan information.

That isn't to say however that the scripts could not be modified to do the same.  No, I am not completely against GUI environments.  I am just against making more work for DBA tasks.

Enough for now, I will report on Tuesday's conferences later this week.

Sunday, September 06, 2009

Shell Tricks

DBAs from time to time must write shell scripts. If your environment is strictly Windows based, this article may hold little interest for you.

Many DBAs however rely on shell scripting to manage databases. Even if you use OEM for many tasks, you likely use shell scripts to manage some aspects of DBA work.

Lately I have been writing a number of scripts to manage database statistics - gathering, deleting, and importing and exporting, both to and from statistics tables and exp files.

Years ago I started using the shell builtin getopts to gather arguments from the command line. A typical use might look like the following:

while getopts d:u:s:T:t:n: arg
do
    case $arg in
        d) DATABASE=$OPTARG
           echo "DATABASE: $DATABASE";;
        u) USERNAME=$OPTARG
           echo "USERNAME: $USERNAME";;
        s) SCHEMA=$OPTARG
           echo "SCHEMA: $SCHEMA";;
        T) TYPE=$OPTARG
           echo "TYPE: $TYPE";;
        t) TABLE_NAME=$OPTARG;;
           #echo "TABLE_NAME: $TABLE_NAME"
        n) OWNER=$OPTARG
           echo "OWNER: $OWNER";;
        *) echo "invalid argument specified"; usage; exit 1;;
    esac
done


In this example, the valid arguments are -d, -u, -s, -T, -t and -n. All of these arguments require a value.

The command line arguments might look like this:
somescript.sh -d orcl -u system -s scott

If an invalid argument such as -z is passed, the script will exit with the exit code set to 1.
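
One refinement worth knowing: getopts can also tell an unknown option apart from an option that is missing its value. The sketch below assumes bash and uses only two of the options. Putting a ':' at the front of the option string switches getopts to silent error reporting, where a missing value arrives as ':' and an unknown option as '?', with the offending letter in OPTARG. The function name parse_opts is made up for the example.

```shell
#!/usr/bin/env bash

parse_opts() {
    local arg
    OPTIND=1
    # the leading ':' turns off getopts' own error messages
    while getopts :d:u: arg
    do
        case $arg in
            d)  DATABASE=$OPTARG ;;
            u)  USERNAME=$OPTARG ;;
            :)  echo "option -$OPTARG requires a value" >&2; return 1 ;;
            \?) echo "unknown option -$OPTARG" >&2; return 1 ;;
        esac
    done
}

parse_opts -d orcl -u system && echo "DATABASE=$DATABASE USERNAME=$USERNAME"
```

This way the script can print a message that says exactly what was wrong, rather than a generic "invalid argument".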

For the script to work correctly, some checking of the arguments passed to the script must be done.

For this script, the rules are as follows:
  • -d and -u must always be set
  • -s must be set if -T is 'SCHEMA'
  • -t and -n must both have a value or be blank
  • -s must be used with -T
In this example, values for -T other than 'SCHEMA' are not being checked.

The usual method (at least for me) to test the validity of command line arguments has always been to use the test, or [] operator with combinations of arguments.

For the command line arguments just discussed, the tests might look like the following:

[ -z "$DATABASE" -o -z "$USERNAME" ] && {
    echo Database or Username is blank
    exit 2
}

# include schema name if necessary
[ "$TYPE" = 'SCHEMA' -a -z "$SCHEMA" ] && {
    echo Please include schema name
    exit 3
}

# both owner and tablename must have a value, or both be blank
[ \( -z "$TABLE_NAME" -a -n "$OWNER" \) -o \( -n "$TABLE_NAME" -a -z "$OWNER" \) ] && {
    echo Please specify both owner and tablename
    echo or leave both blank
    exit 4
}

# if -s is set, so must -T
[ -n "$SCHEMA" -a -z "$TYPE" ] && {
    echo Please include a type with -T
    exit 5
}


As you can see, there are a fair number of tests involved to determine the validity of the command line arguments. You may have guessed why I skipped one for this demo - I just did not want to write any more tests.

Validating command line arguments really gets difficult with a larger number of possible arguments. Worse yet, any later modifications to the script that require a new command line argument become dreaded tasks that are put off as long as possible due to the complexity of testing the validity of command line arguments.

While writing a script that had 11 possible arguments, I was dreading writing the command line argument validation section, and thought there must be a better way.

It seemed that there must be a simple method of using regular expressions to validate combinations of command line arguments. I had never seen this done, and after spending a fair bit of time googling the topic it became apparent that there was not any code available for a cut and paste solution, so it seemed a nice opportunity to be innovative.

After experimenting a bit, I found what I think is a better way.

The method I use is to concatenate all possible command line arguments into a ':' delimited string, and then use a set of pre-prepared regexes to determine whether or not the command line arguments are valid.

One immediately obvious drawback to this method is that arguments containing the ':' character cannot be used. However the delimiting character can easily be changed if needed.

Using the same example as previous, the command line arguments are all concatenated into a string and converted to upper case:

ALLARGS=":$USERNAME:$DATABASE:$OWNER:$TABLE_NAME:$SCHEMA:$TYPE:"
# upper case args
ALLARGS=$(echo "$ALLARGS" | tr "[a-z]" "[A-Z]")
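
As a worked example, assume the script was called with -d orcl -u system -s scott -T schema (values chosen only for illustration). The string then comes out as shown below; the empty owner and table name fields show up as consecutive ':' characters.

```shell
#!/usr/bin/env bash

# values as getopts would have set them for:
#   -d orcl -u system -s scott -T schema
USERNAME=system
DATABASE=orcl
OWNER=
TABLE_NAME=
SCHEMA=scott
TYPE=schema

ALLARGS=":$USERNAME:$DATABASE:$OWNER:$TABLE_NAME:$SCHEMA:$TYPE:"
ALLARGS=$(echo "$ALLARGS" | tr '[:lower:]' '[:upper:]')

echo "$ALLARGS"
# prints :SYSTEM:ORCL:::SCOTT:SCHEMA:
```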


Next a series of regular expressions are created. The first two are generic, and may or may not be used as building blocks for other regular expressions. The others all correspond to a specific command line argument.


# alphanumeric only, at least 1 character
export ALNUM1="[[:alnum:]]+"
# alphanumeric only, at least 3 characters
export ALNUM3="[[:alnum:]]{3,}"
# username - alphanumeric only at least 3 characters
export USER_RE=$ALNUM3
# database - alphanumeric only at least 3 characters
export DATABASE_RE=$ALNUM3
# owner - alphanumeric and _ and $ characters
export OWNER_RE='[[:alnum:]_$]+'
# table_name - alphanumeric and _, # and $ characters
export TABLE_RE='[[:alnum:]_#$]+'
# schema - alphanumeric and _ and $ characters
export SCHEMA_RE='[[:alnum:]_$]+'


These regular expressions could use further refinement (such as username must start with alpha only ) but are sufficient for this demonstration.
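
As a sketch of that refinement, the username regex might be tightened as shown here. The helper function matches is hypothetical and exists only to exercise the regex; it is not part of the script.

```shell
#!/usr/bin/env bash

# username: must start with a letter, alphanumeric after that,
# at least 3 characters in total
USER_RE='[[:alpha:]][[:alnum:]]{2,}'

# hypothetical helper: succeed if the whole of $1 matches the regex in $2
matches() {
    echo "$1" | grep -Eq "^${2}\$"
}

matches SYSTEM "$USER_RE" && echo "SYSTEM accepted"
matches 9LIVES "$USER_RE" || echo "9LIVES rejected"
```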

Next, the regular expressions are concatenated together into ':' delimited strings, with each possible command line argument represented either by its corresponding regex, or by null.

The regexes are stuffed into a bash array. For our example, it looks like this:
#   :   user        :  db           :  owner        :  table     : schema        : type
VALID_ARGS=(
":$USER_RE:$DATABASE_RE:$OWNER_RE:$TABLE_RE::(DICTIONARY_STATS|SYSTEM_STATS|FIXED_OBJECTS_STATS):" \
":$USER_RE:$DATABASE_RE::::(DICTIONARY_STATS|SYSTEM_STATS|FIXED_OBJECTS_STATS):" \
":$USER_RE:$DATABASE_RE:$OWNER_RE:$TABLE_RE:$SCHEMA_RE:(SCHEMA):" \
":$USER_RE:$DATABASE_RE:::$SCHEMA_RE:SCHEMA:")

Notice that there are four different combinations of command line arguments represented.

In all cases the USERNAME and DATABASE are required and must correspond to the regex provided.

In the first combination of arguments, the owner and table must also be specified, and type (-T) must be one of DICTIONARY_STATS, SYSTEM_STATS or FIXED_OBJECTS_STATS.

In the second possible combination, the only argument allowed in addition to DATABASE and USERNAME is the type (-T) argument.

The third combination requires the OWNER, TABLE_NAME and SCHEMA arguments to have valid values, and the TYPE argument must be set to SCHEMA.

The final combination of arguments requires just the SCHEMA argument and the TYPE argument must be set to SCHEMA, in addition to the USERNAME and DATABASE arguments.

By now you likely want to know just how these regular expressions are tested. The following function is used to test the command line arguments against each regular expression:
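function validate_args {
    # first argument: the ':' delimited string of command line arguments
    typeset arglist
    arglist=$1

    # remaining arguments: one regular expression per valid combination
    while shift
    do
        [ -z "$1" ] && break
        # succeed as soon as any combination matches
        if echo "$arglist" | grep -Eq -- "$1"; then
            return 0
        fi
    done

    # no combination matched
    return 1
}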


Here's how it is used in the script:

# Quote the array as "${VALID_ARGS[@]}" so each regex is passed as a separate argument;
# a quoted "${VALID_ARGS[*]}" would appear as a single arg in the function
validate_args "$ALLARGS" "${VALID_ARGS[@]}"
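To see the pieces working together, here is a minimal end-to-end sketch for bash. It uses only two of the four combinations, and the sample values (scott, orcl, hr, emp) and the build_args helper are made up for illustration. It also passes the array as a quoted "${VALID_ARGS[@]}", which is a variation on the call shown above:

```shell
#!/usr/bin/env bash
# Sketch only: a cut-down version of the validation scheme in the article.

USER_RE='[[:alnum:]]{3,}'
DATABASE_RE='[[:alnum:]]{3,}'
SCHEMA_RE='[[:alnum:]_$]+'

#   :   user  :  db        : owner : table : schema : type
VALID_ARGS=(
  ":$USER_RE:$DATABASE_RE::::(DICTIONARY_STATS|SYSTEM_STATS|FIXED_OBJECTS_STATS):"
  ":$USER_RE:$DATABASE_RE:::$SCHEMA_RE:SCHEMA:"
)

# Return 0 if the argument string matches any of the regexes that follow it.
validate_args() {
  arglist=$1
  shift
  for re in "$@"; do
    if printf '%s\n' "$arglist" | grep -Eq -- "$re"; then
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: join six values (user, db, owner, table, schema, type)
# into the ':' delimited, upper-cased string the regexes expect.
build_args() {
  printf ':%s:%s:%s:%s:%s:%s:' "$@" | tr '[:lower:]' '[:upper:]'
}

# A schema-level request: only user, database, schema and type are set.
good=$(build_args scott orcl '' '' hr schema)
# Owner and table supplied without a schema: matches neither combination here.
bad=$(build_args scott orcl hr emp '' schema)

for args in "$good" "$bad"; do
  if validate_args "$args" "${VALID_ARGS[@]}"; then
    echo "valid:   $args"
  else
    echo "invalid: $args"
  fi
done
```

Because every regex begins and ends with ':' and contains the same number of delimiters as the argument string, a match has to account for every field even though grep is not told to anchor the pattern.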


While this method may appear somewhat confusing at first, it becomes less so after using it a few times. It greatly simplifies the use of many command line arguments that may appear in differing combinations.

As far as I know, this method only works properly with the bash shell. I have done testing on only two shells, bash and ksh. It does not work properly on ksh.

Here's a demonstration of the ksh problem. The following script is run from both ksh and bash:

function va {

echo ARG1: $1
}


R1="[[:alnum]]+"
R2="[[:alnum]]{3,}"

va $R1
va $R2
And here are the results:
18:9-jkstill-18 > ksh t3
ARG1: [[:alnum]]+
ARG1: [[:alnum]]3
[ /home/jkstill/bin ]

jkstill-18 > bash t3
ARG1: [[:alnum]]+
ARG1: [[:alnum]]{3,}
[ /home/jkstill/bin ]



Notice that when the script is run with ksh, the '{', '}' and ',' are removed from the regular expression. I could find no combination of quoting and escape characters that could prevent that from happening. This method of command line argument validation could be made to work using ksh if those characters are not used in the regexes. That would be rather limiting though.

One other drawback you may have noticed with this method of validating command line arguments is that when an error condition is encountered, the exit code is always 1. With the [] method it was easy to exit with different codes to indicate the nature of the error. Something similar could likely be done by embedding a code into each set of regexes, but I will leave that as an exercise for the reader.
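One possible way to approach that exercise is sketched below. It is only one reading of the idea, not the method used in the prototype script: each entry is prefixed with an exit code and '=', valid combinations carry code 0, and a few extra regexes describe known mistakes. The classify_args function, the rule names and the codes 2 and 3 are all invented for this example.

```shell
# Sketch only: regexes tagged with the exit code to return when they match.

USER_RE='[[:alnum:]]{3,}'
DATABASE_RE='[[:alnum:]]{3,}'
TYPES='(DICTIONARY_STATS|SYSTEM_STATS|FIXED_OBJECTS_STATS)'

# Entry format is "<exit code>=<regex>". The regexes must not contain '='
# ahead of the separator, since the first '=' splits code from regex.
RULE_OK="0=:$USER_RE:$DATABASE_RE::::$TYPES:"
RULE_NO_USER='2=^::'          # username field is empty
RULE_NO_DB='3=^:[^:]+::'      # database field is empty

# Return the code of the first matching entry, or 1 if nothing matches.
classify_args() {
  arglist=$1
  shift
  for entry in "$@"; do
    code=${entry%%=*}   # text before the first '='
    re=${entry#*=}      # text after the first '='
    if printf '%s\n' "$arglist" | grep -Eq -- "$re"; then
      return "$code"
    fi
  done
  return 1
}

for args in ':SCOTT:ORCL::::SYSTEM_STATS:' \
            '::ORCL::::SYSTEM_STATS:' \
            ':SCOTT:::::SYSTEM_STATS:' \
            ':SCOTT:ORCL::::BOGUS:'; do
  rc=0
  classify_args "$args" "$RULE_OK" "$RULE_NO_USER" "$RULE_NO_DB" || rc=$?
  echo "$args -> exit code $rc"
done
```

The order of the entries matters: the valid combinations are listed first so that a correct set of arguments returns 0 before any of the error regexes are tried.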

The complete prototype script, as well as a test script can be downloaded:


The next article will include a set of functions used along with the validate_args() function to make shell scripts a bit more robust.

Thursday, August 06, 2009

Detecting Corrupt Data Blocks

Or more accurately, how not to detect corrupt data blocks.

This thread on Oracle-L concerns lost writes on a database.

One suggestion was made to use the exp utility to export the database, thereby determining if there are corrupt blocks in the database due to disk failure. I didn't give it much thought at first, but fellow Oak Table member Mark Farnham got me thinking about it.

Using exp to detect corrupt blocks, or rather, the absence of corrupt blocks may work, but then again, it may not. It is entirely possible to do a full table scan on a table successfully, as would happen during an export, even though the blocks on disk have been corrupted.

This can be demonstrated by building a table, ensuring the contents are cached, then destroying the data in the data file, followed by a successful export of the table.

Granted, there are a lot of mitigating factors that could be taken into consideration as to whether or not this would happen in a production database. That's not the point: the point is that it could happen, so exp is not a reliable indicator of the state of the data files on disk.

This test was performed on Oracle 10.2.0.4 EE on RH Linux ES 4. Both are 32 bit.

First create a test tablespace:

create tablespace lost_write datafile '/u01/oradata/dv11/lost_write.dbf' size 1m
extent management local
uniform size 64k
/



Next the table LOST_WRITE is created in the tablespace of the same name. This will be used to test the assertion that a successful export of the table can be done even though the data on disk is corrupt.

create table lost_write
cache
tablespace lost_write
as
select * from dba_objects
where rownum <= 1000
/

begin
dbms_stats.gather_table_stats(user,'LOST_WRITE');
end;
/

select tablespace_name, blocks, bytes
from user_segments
where segment_name = 'LOST_WRITE'
/


TABLESPACE_NAME                    BLOCKS      BYTES
------------------------------ ---------- ----------
LOST_WRITE                             16     131072

1 row selected.



Next, do a full table scan and verify that the blocks are cached:

select * from lost_write;

Verify in cache:
select file#,block#,class#, status
from v$bh where ts# = (select ts# from sys.ts$ where name = 'LOST_WRITE')
order by block#
/

     FILE#     BLOCK#     CLASS# STATUS
---------- ---------- ---------- -------
        40          2         13 xcur
        40          3         12 xcur
        40          9          8 xcur
        40         10          9 xcur
        40         11          4 xcur
        40         12          1 xcur
        40         13          1 xcur
        40         14          1 xcur
        40         15          1 xcur
        40         16          1 xcur
        40         17          1 xcur
        40         18          1 xcur
        40         19          1 xcur
        40         20          1 xcur
        40         21          1 xcur
        40         22          1 xcur
        40         23          1 xcur




Now swap the bytes in the file, skipping the first 2 Oracle blocks.
Caveat: I don't know if that was the correct number of blocks, and I didn't spend any time trying to find out.
Also, I belatedly saw that the count probably should have been 22 rather than 16, but the result still served the purpose of corrupting the datafile, as we shall see in a bit.

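This dd command uses the same file for both input and output: it reads 16 blocks, starting after the first two, swaps each pair of bytes, and writes the result back to the file. Note that skip= applies only to the input. With no seek= given, dd writes the swapped data starting at the beginning of the file rather than back onto blocks 3-18.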

dd if=/u01/oradata/dv11/lost_write.dbf of=/u01/oradata/dv11/lost_write.dbf bs=8129 skip=2 count=16 conv=swab,notrunc



The effect is demonstrated by this simple test:

jkstill-19 > echo hello | dd
hello
0+1 records in
0+1 records out
[ /home/jkstill ]

jkstill-19 > echo hello | dd conv=swab
ehll
o0+1 records in
0+1 records out
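The same in-place rewrite can be tried safely on a scratch file rather than a datafile. This is a small sketch under a few assumptions: the 12-byte file and bs=4 simply stand in for the datafile and its block size, and seek=1 is added here so that the swapped block is written back to the position it was read from.

```shell
# Sketch only: byte-swap one "block" of a scratch file in place with dd.

scratch=$(mktemp)
printf 'ABCDEFGHIJKL' > "$scratch"     # three 4-byte blocks

# skip=1  : start reading at the second block of the input
# seek=1  : start writing at the second block of the output
# count=1 : process a single block
# conv=swab,notrunc : swap each pair of bytes, do not truncate the output file
dd if="$scratch" of="$scratch" bs=4 skip=1 seek=1 count=1 \
   conv=swab,notrunc 2>/dev/null

result=$(cat "$scratch")
echo "$result"                         # ABCDFEHGIJKL

rm -f "$scratch"
```

Only the middle block changes: EFGH becomes FEHG, while the first and last blocks are left as they were.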


Now we can attempt the export:

exp tables=\(jkstill.lost_write\) ...

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, Oracle Label Security, Data Mining and Real Application Testing options
Export done in WE8MSWIN1252 character set and AL16UTF16 NCHAR character set

About to export specified tables via Conventional Path ...
. . exporting table LOST_WRITE 1000 rows exported
Export terminated successfully without warnings.


So, even though the data on disk has been corrupted, the export succeeded. That is due to the table being created with the CACHE option, and all the blocks being cached at the time of export. It may not be necessary to use the CACHE option, but I used it to ensure the test would succeed.

Now let's see what happens when trying to scan the table again. First the NOCACHE option will be set on the table, then a checkpoint.

10:42:45 dv11 SQL> alter table lost_write nocache;

10:43:02 dv11 SQL> alter system checkpoint;

Now try to scan the table again:

10:43:14 ordevdb01.radisys.com - js001292@dv11 SQL> /
select * from lost_write
*
ERROR at line 1:
ORA-00376: file 40 cannot be read at this time
ORA-01110: data file 40: '/u01/oradata/dv11/lost_write.dbf'



A corollary conclusion can be drawn from this example.

If you do discover bad data blocks, you just might be able to do an export of the tables that are in the affected region before doing any recovery. This might be a belt and suspenders approach, but DBAs are not generally known for taking unnecessary chances when possible data loss is on the line.

Thursday, July 09, 2009

Classic MetaLink vs. My Oracle Support

If you are in any way involved with supporting Oracle products, then you know that the death knell for Classic MetaLink has sounded. MetaLink will be unplugged at the end of July 2009.

The new support site, My Oracle Support, seems to be causing some pain for quite a few people in the Oracle user community.

Some of the complaints regard limited platform support due to the Adobe Flash 9 requirements, navigation and response times.

On the other hand there are some cool new features such as Power View, Configuration Manager and the new Advanced Search options.

How do you feel about it?

Here's a chance to let your opinion be known, as a poll has been created where you can vote on it.


At this time 637 voters have voiced their opinion about MetaLink and My Oracle Support.

Current Poll results can be found in this Excel File: MetaLink vs My Oracle Support Results