Forum OpenACS Q&A: ANN: acs-kernel port to MS-SQL Server

I've written some code to machine-translate OpenACS plpgsql into MSSQL Server 2000 compliant transact sql, and have completed initial work on porting the acs-kernel sql files.

I wrote up a white paper the explains the what, why and how. In brief, I wanted to see if I could possibly re-use the kernel data model on a client project. In addition, I wanted to see how much leverage I could get out of automated machine translation. With the exception of some cursors, triggers and dynamic SQL statements, the answer turned out to be quite a lot.

The source code to the translation code needs to be cleaned up a little bit before release, but I've produced several files for people to examine what I've done in the interim:

  • An HTML-ized side-by-side comparison of pl/pgsql and translated tsql (includes error messages). [link] (480K)
  • Release notes [link]
  • A single sql file containing all converted SQL [link](170K)

I'm not likely to have time to do much with this project in the near future, but please do contact me if what I've done could be of use or would a link to the source code when it's published. (I'll post it here, but just in case)

Collapse
Posted by Lars Pind on
John,

This looks really cool. Thanks for doing this.

Though it's not quite finished yet, it's good to know that there's a path, if someone in the community gets a project where MSSQL is a requirement.

Collapse
Posted by Jun Yamog on
Wow! such a great accomplishment.  Also it does open the window to use OpenACS on another platform.... and that platform is in the other extreem SQL server.  Most of the time clients that have Oracle has a *nix box where postgres can be installed.  People who have chosen MS SQL seem to be more of a MS shop or those who can't afford Oracle.  This effort should be joined by other people who has the same problem or interest.
Collapse
Posted by leigh silvester on
I wish I had found this a couple of months ago!!

Have just spent a load of time figuring out plpgsql and converting this to run on MS SQL.

Oh well, it has been intersting to look at your code and see how it compares to what I have com up with.

Have you progressed any further with this?

Collapse
Posted by John Sequeira on
Leigh,

No -- I haven't progressed on it further and don't expect to.

With any kind of automated translation project like this you hit the point of diminishing returns, and after I taught myself the basics of parse tree rewriting and got the acs-core done, I hit that point. Anything beyond that would just have created a maintenance job (keeping pg+mssql+oracle in sync) that neither I nor anyone else really wanted.

If I were looking to port a great deal of the acs kernel code (or beyond) in the future, I might look to some of the vendor tools for doing this. Microsoft now has a free Oracle Migration toolkit, for example, (http://www.microsoft.com/sql/solutions/ssm/ssmav2.mspx) and other commercial ones exist. You might try begging a license from them for use in open source projects.

John

Collapse
Posted by Nima Mazloumi on
wouldnt it be possible to provide xql files for mssql and commit that to cvs?
yes, that's possible.
Plus you'll need to provide /sql/mssql/ package create files as well.
Collapse
Posted by John Sequeira on
Nima,

In theory, you're correct - all you have to do is add some XQL files to CVS to support a new db.

In practice, however, there are more than a few lines of database-specific code in OpenACS that would have to be refactored. You would also have to update docs to deal with installing the proper db driver (odbc or native). Then of course after you did all this (mucking with code that 100% of the community relies on to service that 0.001 %) you would have to commit to maintaining the code+docs+tests going forward. And even then you would not have started to port the huge library of package-specific sql code or address coordinating this with package maintainers.

If a project of this magnitude were to be taken on (which is highly unlikely), it would be better to support the other 99.5% of web servers (with something like portable.nsd) rather than a database with ~30% market share which very few folks with OpenACS expertise care about.

Collapse
Posted by leigh silvester on
I have been working on extending this code.
Part of what I have been doing is reworking it so that it can be run with no errors or warnings.
So far I have got about half way through where I am hitting errors that are due to missing functions/procedures.

Some of the missing components so far are party__new and acs_object__delete.

For party__new I have fabricated the following:
CREATE function dbo.party__new
@new__person_id integer,
@new__object_type varchar(8000),
@new__creation_date varchar(50),
@new__creation_user integer,
@new__creation_ip varchar(8000),
@new__email varchar(8000),
@new__url varchar(8000),
@new__context_id integer,
@ms_return_value integer = 0 OUTPUT

AS
BEGIN
declare @retval int
EXECE dbo.obs_object__new @new__person_id,@new__object_type,@new__creation_date,@new__creation_user,@new__creation_ip,@new__email,@new__url,@new__context_id,@ms_return_value = @retval OUTPUT
SET @ms_return_value = @retval
INSERT INTO parties (party_id, email, url)
VALUES (@ms_return_value, lower(@new__email), @new__url)
RETURN @ms_return_value
END

acs_object__delete is proving to be a bit more challenging.
The postgres version uses a proprietary function, viz: tree_right.

The section which is challenging to me is:
for obj_type
in select o2.table_name, o2.id_column
from acs_object_types o1, acs_object_types o2
where o1.object_type = (select object_type
from acs_objects o
where o.object_id = delete__object_id)
and o1.tree_sortkey between o2.tree_sortkey and tree_right(o2.tree_sortkey)
order by o2.tree_sortkey desc
loop
-- Delete from the table.

-- DRB: I removed the quote_ident calls that DanW originally included
-- because the table names appear to be stored in upper case. Quoting
-- causes them to not match the actual lower or potentially mixed-case
-- table names. We will just forbid squirrely names that include quotes.
-- daveB
-- ETP is creating a new object, but not a table, although it does specify a
-- table name, so we need to check if the table exists. Wp-slim does this too

if table_exists(obj_type.table_name) then
execute ''delete from '' || obj_type.table_name ||
'' where '' || obj_type.id_column || '' = '' || delete__object_id;
end if;
end loop;

For MS SQL I have so far interpreted this as...
DECLARE @object_type varchar(8000)
DECLARE @object_name varchar(8000)
DECLARE @object_id integer

-- FOR obj_type
DECLARE cursor_1 CURSOR FOR
SELECT o2.table_name,o2.id_column
FROM dbo.obs_object_types o1, dbo.obs_object_types o2
WHERE o1.object_type = (
SELECT object_type
FROM dbo.obs_objects o
WHERE o.object_id = @delete__object_id
)
AND o1.tree_sortkey BETWEEN o2.tree_sortkey AND tree_right(o2.tree_sortkey)
-- ORDER BY o2.tree_sortkey DESC
OPEN cursor_1
FETCH NEXT FROM cursor_1 INTO @object_type
WHILE (@@FETCH_STATUS = 0)
BEGIN
-- loop
-- DELETE FROM dbo.the table.
-- table name, so we need to check if the table exists. Wp-slim does this too
IF Exists (SELECT id FROM dbo.sysobjects WHERE name = @object_type)
BEGIN
--IF table_exists(obj_type.table_name) THEN
DELETE FROM dbo.obs_object_types WHERE id_column = @delete__object_id
END IF
FETCH NEXT FROM cursor_1 INTO @object_type
END -- while
CLOSE cursor_1
DEALLOCATE cursor_1

This is not perfect and it has the tree_right right function which of course is invalid. Is this really a right join?

I would be happy to post developments from John's original code as it evolves should John be amenable.

I am interested in a few things such as:
Is there any particular reason why this code dispenses with the use of triggers?
Perhaps they were not in use by the version of the code that was in existence when the port was done?
Why are email values and urls etc allowed to be so masssive?
In most applications I develop I usually allow up to 200 characters.

Collapse
Posted by Don Baccus on
The tree_right and BETWEEN operator are used to select subtrees of the object hierarchy. The equivalent in Oracle is "CONNECT BY" and "START WITH". It's not at all like a right join.
Collapse
Posted by John Sequeira on
Leigh,

I'm very happy to help where I can even though it's not an active project for me. I delighted that you're finding the work valuable. The OpenACS data model deserves to be stolen by as many other projects (and databases) as possible.

Just so you know, I did not hand-write very much of the ms-sql code - I wrote a very ugly compiler that parsed postgres and emitted transact sql. That explains a lot of the weirdness that you'll see. I think I made all varchars varchar(8000) ... I think this was to simplify a type mismatch between postgres and mssql but I don't remember specifics (i.e. 8000 handled some corner cases that my translator didn't or couldn't).

I also didn't bothere to write a parser/tokenizer for trigger syntax, as there were so few triggers it was easier to do by hand.

Speaking of triggers, for the tree stuff, in the codebase I translated (4.1?), openacs used to use triggers to maintain a parent-child lookup table to allow easy hierarchical queries (delete all children of this object). I think that delete object was rewritten when Postgres caught up with Oracle and implemented hierarchical query support in plpgsql.

with MSSQL 2005, SQL Server now has similar native support for tree queries... is that what you're targetting? You could either use that, or try to use the old triggers to maintain the parent-child table of yore.

(here's a writeup of the technique)

http://www.sqlteam.com/item.asp?ItemID=8866

I'm curious, what's your motivation for doing this?

John

Collapse
Posted by leigh silvester on
native support for tree queries
Useful tip.

I'm curious, what's your motivation for doing this?

Well you kinda hinted at it when you wrote...
"The OpenACS data model deserves to be stolen by as many other projects (and databases) as possible".

I am developing a new version of a system I have been working on elsewhere which needs a sophisticated object management system at it's core, OpenACS to me looks as if it has that.
It is more the core I am interested - objects, groups, parties and persons management. The APM management also has intersting features which could make maintaing this system and its various add on features more manageable.

A previous version of the system relied on LDAP directory to manage users, organisation, hierarchy and permissions. This might not always be feasible so I have turned to Open ACS to guide me. The original intention was to get to grips with Open ACS and make my own lite version, but the more I got to understand how it was working (although I wouldn't claim to fully understand it) the more I saw it would be silly to strip away various layers.
Does this make sense?

Collapse
Posted by leigh silvester on
Unfortunately "CONNECT BY" is not used by MS SQL.
I have just downloaded the " SwisSQL - Oracle to Sybase Migration Tool" which purports to be able to convert procedures which make use of "CONNECT BY".

Shall see what this can produce.

Collapse
Posted by John Sequeira on
No 'CONNECT BY' in MS-SQL - that would be too easy(!)

I think the construct you want is called a common table expression (CTE)

http://www.theserverside.net/discussions/thread.tss?thread_id=30972

I think it's different enough that an automated tool would have some issues... but you might get lucky.

And on that note,  you should check out Microsoft's free Oracle -> MS-SQL migration tool:

http://www.microsoft.com/sql/solutions/ssm/ssmav2.mspx

They recently purchased it from some specialist firm,  so I expect it works well.

Thanks again for another hot tip.
As this requires connection to an Orcale database to perform migration I am currently installing Oracle lite.

One thing has caused me a little confusion is that I am working from SQL for the latest version, whereas you performed migration on an older version.

Consequently the procedures that I am struggling to recreate in MS SQL do not exist in the Oracle version (seem to be present as package components operating in a different way), but are present in the Postgres SQL.
All very confusing.

Shall be interested to see how this migration tool copes with the oracle packages and associated methods.

I know this is probably the wrong place to say it, and I know it is because I am a simple soul who is just starting to get a little out of his depth here, but Oracle is just too f@#$~]g weird! I'm sure it is fantastic once you get to know it.

Collapse
Posted by leigh silvester on
Just come back to this after having been diverted to other things.
Been playing with your SQL so that it parses neatly with no errors or warnings.

One bit I am having trouble with during the initial "installation" phase is in procedure acs_rel_type__create_type.

It has a line...
insert into acs_rel_types ( rel_type, object_type_one, role_one, min_n_rels_one, max_n_rels_one, object_type_two, role_two, min_n_rels_two, max_n_rels_two ) values ( @create_type__rel_type , @create_type__object_type_one , @create_type__role_one , @create_type__min_n_rels_one , @create_type__max_n_rels_one , @create_type__object_type_two , @create_type__role_two , @create_type__min_n_rels_two , @create_type__max_n_rels_two )

When this is called via acs_rel_type__create_type in the procedure inline_0 in acs-relationships-create.sql section it causes via a temporary procedure I fabricated to replace something that appears to have been missed in your translation...

CREATE PROCEDURE dbo.inline_0
@ms_return_value integer = 0 OUTPUT
AS
BEGIN
DECLARE @attr_id int
EXEC dbo.obs_rel_type__create_type
'relationship', --@create_type__rel_type
'Relationship', --@create_type__pretty_name
'Relationships', --@create_type__pretty_plural
'obs_object', --@create_type__supertype
'obs_rels', --@create_type__table_name
'rel_id', --@create_type__id_column
'obs_rel', --@create_type__package_name
'obs_rel_types', --@create_type__object_type_one
'obs_object', --@create_type__role_one
0, --@create_type__min_n_rels
null, --@create_type__max_n_rels_one
'obs_object', --@create_type__object_type_two
'', --@create_type__role_two
0, --@create_type__min_n_rels_two
null, --@create_type__max_n_rels_two
@ms_return_value = @attr_id OUTPUT
RETURN 0
END -- stored proc
GO

... I get the error:

INSERT statement conflicted with COLUMN FOREIGN KEY constraint 'acs_rel_types_obj_type_1_fk'.
The conflict occurred in database 'orgbase', table 'acs_object_types', column 'object_type'.

The problem seems to be with the foreign keys for table acs_rel_types which are conflicated when the relationship type is added?

Any thought anyone?

I am nearly at a stage where I can post the extended code from John's original somewhere.
Is any one interested in this?