Searching Across Tables In SQL Server
I recently had to search many tables for a list of values in SQL Server. Here's how I used dynamic SQL to achieve the result I needed.
I was recently tasked with a data cleanup assignment in Microsoft SQL Server. We had a group of records to delete from a master table, but there was related data on over twenty other different tables. Before the records could be deleted from the master table, the related tables all needed to be queried to see if any linked rows needed to be reviewed.
Querying these 20+ tables individually would be a great deal of manual effort, of course. Fortunately, SQL Server gives us a better way, using the
What is INFORMATION_SCHEMA?
INFORMATION_SCHEMA gives us a way to view metadata about a database. It provides information about tables, columns, views, schemata, etc. Click here to view the documentation on
INFORMATION_SCHEMA from Microsoft.
As an example of how we can utilize
INFORMATION_SCHEMA, let's say we want to get a list of all tables that contain the word "PERSON" in them. You could do so with the following query.
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME LIKE '%PERSON%';
If you have multiple databases on your SQL Server instance, you could also include
TABLE_CATALOG in your where clause to restrict the database(s) you want to look at.
INFORMATION_SCHEMA, I was able to programmatically generate the list of related tables I needed to go query. Note that if the tables you are looking at do not have any kind of standard naming convention, this may not be beneficial in your case, and you'll likely have to generate the list of tables to go check manually. But, if you are querying tables that have a common pattern in their naming convention, this can save you a good deal of time and effort.
Putting it into Practice
Here is the full code, and then I'll go piece by piece through it:
DROP TABLE IF EXISTS #DataToReview CREATE TABLE #DataToReview ( src_table varchar(100), user_id numeric(10,0) ) DECLARE @table VARCHAR(100); DECLARE @sqlCommand VARCHAR(2000); DECLARE src_cursor CURSOR FOR SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_CATALOG = 'CORE' AND TABLE_NAME LIKE '%USER_CORE%' OPEN src_cursor FETCH NEXT FROM src_cursor INTO @table WHILE @@FETCH_STATUS = 0 BEGIN SET @sqlCommand = 'INSERT INTO #DataToReview (src_table, user_id) SELECT DISTINCT ''' + @table + ''', T.USER_ID FROM [CORE].[dbo].[ITEMS_TO_DELETE] T JOIN [CORE].[dbo].' + @table + ' TBL ON TBL.USER_ID = T.USER_ID' EXEC (@sqlCommand) FETCH NEXT FROM src_cursor INTO @table END CLOSE src_cursor DEALLOCATE src_cursor SELECT * FROM #DataToReview GO
Let's look at this, piece by piece:
DROP TABLE IF EXISTS #DataToReview CREATE TABLE #DataToReview ( src_table varchar(100), user_id numeric(10,0) )
At the outset, I'm creating a temporary table to hold my final results (data from my source set of records to remove from the parent table, that also exists in any of the child tables). I'm dropping the table in case it already exists, and then recreating it fresh. I'm populating it with two columns. In addition to
user_id, which will hold the user's id that matches the data in the source set, I'm also including the
src_table to hold the table name that the matching record appears in.
DECLARE @table VARCHAR(100); DECLARE @sqlCommand VARCHAR(2000); DECLARE src_cursor CURSOR FOR SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_CATALOG = 'CORE' AND TABLE_NAME LIKE '%USER_CORE%' OPEN src_cursor
Next, I'm declaring two variables and a SQL cursor. The variable
@table will be used the process of iterating through the tables we need to query for matches, and the
@sqlCommand variable will hold dynamically generated SQL queries to check for matching data.
src_cursor creates a cursor, which is a way to iterate through the rows of a result set in SQL Server. Using a cursor, we can go through the query results one row at a time (in addition to other advanced options such as moving backwards through the results as well). For more on cursors in SQL Server, click here. In our case, we are running a query against the previously explained
INFORMATION_SCHEMA.TABLES - I'm looking in the
CORE database and trying to locate all tables that contain
USER_CORE in the name. Note that so far, we have not obtained any final results yet, we've only gotten the list of tables we need to go and query.
FETCH NEXT FROM src_cursor INTO @table WHILE @@FETCH_STATUS = 0 BEGIN SET @sqlCommand = 'INSERT INTO #DataToReview (src_table, user_id) SELECT DISTINCT ''' + @table + ''', T.USER_ID FROM [CORE].[dbo].[ITEMS_TO_DELETE] T JOIN [CORE].[dbo].' + @table + ' TBL ON TBL.USER_ID = T.USER_ID' EXEC (@sqlCommand) FETCH NEXT FROM src_cursor INTO @table END CLOSE src_cursor DEALLOCATE src_cursor
Now we are into the thick of it. The first statement will fetch the next row of the cursor query's result (in this case, the first row), storing it in the
@table variable. Next, it checks the global SQL Server function
@@FETCH_STATUS, which will return the status of the last cursor fetch. I will say that
@@FETCH_STATUS is a bit strange; while you might expect it to return a value of
1 if the last fetch returned a row,
0 is, in fact, the result we are looking for. If there is an error or we've reached the end of the result set, a negative value is returned. Click here for more information on
@@FETCH_STATUS, how it's used, and what the function returns.
In our code, then if we get that
0 value back for
@@FETCH_STATUS, the code between the
END block is executed. We are dynamically generating a SQL query that is going to insert its results into the temporary table we created back at the beginning. This query is going to take the
ITEMS_TO_DELETE table, which is a table we created only holding the
user_id values to delete from the parent table. It then joins to whichever table we are currently looking at in the list the cursor is iterating through, using the
user_id as the common field. Once the inserting query has been dynamically generated and stored in
EXEC statement is used to carry out the command. Just prior to the end of the block, the
FETCH NEXT status grabs the next row, and the block will repeat as long as there is a next row in the result set. Once the end is reached, the cursor is closed and deallocated.
SELECT * FROM #DataToReview GO
Finally, we conclude our code by selecting the results from our
DataToReview temp table. This will give us all of the user_id's that appear in any of the tables we searched, along with the table that they showed up in.
The above might not necessarily be a specific scenario you face in SQL Server, but knowing how to dynamically search
INFORMATION_SCHEMA for tables, columns, and more is a very useful thing to have in your skillset. Likewise, being able to iterate through results with cursors and build SQL statements dynamically can help save you a great deal of time.
Have questions or suggestions on other ways to solve this problem? Feel free to comment below or reach out on Twitter!
Enjoyed this post? Consider buying me a coffee to fuel the creation of more content.