Well, here's a solution. If you have any ideas on how to improve this, or re-write it altogether to be more efficient or effective, please share!
The first section of code here creates a temporary table and inserts into the temporary table the list of files of the targeted directory. In this case, we're looking at the bill directory on the Elite application server. You'll need to modify this path according to your environment. One note: the collation in the CREATE TABLE needs to be the same as the collation of your Document_Catalog table otherwise you may get errors when joining the two.
USE [son_db] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO --Drop temporary table if it already exists IF OBJECT_ID(N'tempdb.dbo.#DirContents', N'U') IS NOT NULL DROP TABLE tempdb.dbo.#DirContents; GO --Create temporary table to store contents of directory CREATE TABLE #DirContents ( Line VARCHAR(512) COLLATE SQL_Latin1_General_CP1_CS_AS NULL) DECLARE @PathName VARCHAR(256) , @CMD VARCHAR(512) SET @PathName = '\\eliteappserver\elite\home\bill\' SET @CMD = 'DIR ' + @PathName + ' /TC' PRINT @CMD INSERT INTO #DirContents EXEC MASTER..xp_cmdshell @CMD
The results from above should appear like this:
Database name 'tempdb' ignored, referencing object in tempdb. DIR \\eliteappserver\elite\home\bill\ /TC (32568 row(s) affected)
By taking a quick look at the temporary table you should see this:
Line |
---|
Volume in drive \\eliteappserver\elite\home\bill is Data_LUN72 |
Volume Serial Number is ABCD-E123 |
NULL |
Directory of \\eliteappserver\elite\home\bill |
NULL |
07/17/2009 07:34 PM <DIR> . |
07/17/2009 07:34 PM <DIR> .. |
10/06/2008 01:09 PM 41,270 000001.000001.481826.10062008.pdf |
09/03/2008 01:00 PM 42,434 000567.000026.471323.06052008.pdf |
09/03/2008 01:00 PM 42,680 000567.000026.473887.07072008.pdf |
These next few lines of code will strip out the non-file lines from the temporary table.
DELETE FROM #DirContents WHERE Line NOT LIKE '[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %' OR Line LIKE '%<DIR>%' OR Line is null
Next we need to compare the files in the directory (temp table) to the records in the Document_Catalog.
WITH realfiles AS ( SELECT 'DocName' = REVERSE( LEFT(REVERSE(Line),CHARINDEX(' ',REVERSE(line))-1 ) ), 'CreateDate' = LEFT(Line,10) FROM #DirContents WHERE LEFT(Line,10) LIKE '12/%%/2009' ), doccatfiles AS ( SELECT EMP_ID ,REVERSE(LEFT(REVERSE(CAST(FILE_PATH AS VARCHAR(400))),CHARINDEX('\',REVERSE(CAST(FILE_PATH AS VARCHAR(400))))-1)) AS 'DocName' FROM Document_Catalog WHERE FILE_DATE >= '12/1/09' AND DESC_DATE_2 >= '12/1/09' ) SELECT DISTINCT doccatfiles.EMP_ID FROM doccatfiles WHERE doccatfiles.DocName NOT IN (SELECT realfiles.DocName FROM realfiles)Note: Since we're doing so much text manipulation and comparing long strings of text, this query is rather slow. You'll see I limited the inputs here to only December 2009 to make the information more manageable. If you have a large issue, I would recommend separating it into batches like this.
You now have a list of invoices that are missing from your bill location. You can now recap these specific invoices rather than target all within a certain date range of the error.
Thanks for running this blog! I've been looking for some community around Whitehill (specifically, around template development and maintenance).
ReplyDelete