Searching Across Tables In SQL Server

I recently had to search many tables for a list of values in SQL Server. Here's how I used dynamic SQL to achieve the result I needed.

Posted October 12, 2021 (Revised October 24, 2021) SQL Server Data Engineering

I was recently tasked with a data cleanup assignment in Microsoft SQL Server. We had a group of records to delete from a master table, but there was related data on over twenty other different tables. Before the records could be deleted from the master table, the related tables all needed to be queried to see if any linked rows needed to be reviewed.

Querying these 20+ tables individually would be a great deal of manual effort, of course. Fortunately, SQL Server gives us a better way, using the INFORMATION_SCHEMA.

What is INFORMATION_SCHEMA?

SQL Server's INFORMATION_SCHEMA gives us a way to view metadata about a database. It provides information about tables, columns, views, schemata, etc. Click here to view the documentation on INFORMATION_SCHEMA from Microsoft.

As an example of how we can utilize INFORMATION_SCHEMA, let's say we want to get a list of all tables that contain the word "PERSON" in them. You could do so with the following query.

SELECT 
  TABLE_NAME
FROM
  INFORMATION_SCHEMA.TABLES
WHERE
  TABLE_NAME LIKE '%PERSON%';

If you have multiple databases on your SQL Server instance, you could also include TABLE_CATALOG in your where clause to restrict the database(s) you want to look at.

By using INFORMATION_SCHEMA, I was able to programmatically generate the list of related tables I needed to go query. Note that if the tables you are looking at do not have any kind of standard naming convention, this may not be beneficial in your case, and you'll likely have to generate the list of tables to go check manually. But, if you are querying tables that have a common pattern in their naming convention, this can save you a good deal of time and effort.

Putting it into Practice

Here is the full code, and then I'll go piece by piece through it:

DROP TABLE IF EXISTS #DataToReview

CREATE TABLE #DataToReview
(
src_table varchar(100),
user_id numeric(10,0)
)

DECLARE @table VARCHAR(100);
DECLARE @sqlCommand VARCHAR(2000);

DECLARE src_cursor CURSOR FOR
    SELECT 
        TABLE_NAME
    FROM
        INFORMATION_SCHEMA.TABLES
	WHERE
        TABLE_CATALOG = 'CORE'
	    AND TABLE_NAME LIKE '%USER_CORE%'

OPEN src_cursor
FETCH NEXT FROM src_cursor INTO @table

WHILE @@FETCH_STATUS = 0
BEGIN
	SET @sqlCommand = 'INSERT INTO #DataToReview (src_table, user_id)
	SELECT DISTINCT
		''' + @table + ''',
		T.USER_ID
	FROM
		[CORE].[dbo].[ITEMS_TO_DELETE] T
	JOIN [CORE].[dbo].' + @table + ' TBL
		ON TBL.USER_ID = T.USER_ID'

	EXEC (@sqlCommand)

    FETCH NEXT FROM src_cursor INTO @table
END

CLOSE src_cursor
DEALLOCATE src_cursor

SELECT * FROM #DataToReview
GO

Let's look at this, piece by piece:

DROP TABLE IF EXISTS #DataToReview

CREATE TABLE #DataToReview
(
src_table varchar(100),
user_id numeric(10,0)
)

At the outset, I'm creating a temporary table to hold my final results (data from my source set of records to remove from the parent table, that also exists in any of the child tables). I'm dropping the table in case it already exists, and then recreating it fresh. I'm populating it with two columns. In addition to user_id, which will hold the user's id that matches the data in the source set, I'm also including the src_table to hold the table name that the matching record appears in.

DECLARE @table VARCHAR(100);
DECLARE @sqlCommand VARCHAR(2000);

DECLARE src_cursor CURSOR FOR
    SELECT 
        TABLE_NAME
    FROM
        INFORMATION_SCHEMA.TABLES
	WHERE
        TABLE_CATALOG = 'CORE'
	    AND TABLE_NAME LIKE '%USER_CORE%'

OPEN src_cursor

Next, I'm declaring two variables and a SQL cursor. The variable @table will be used the process of iterating through the tables we need to query for matches, and the @sqlCommand variable will hold dynamically generated SQL queries to check for matching data.

Finally, src_cursor creates a cursor, which is a way to iterate through the rows of a result set in SQL Server. Using a cursor, we can go through the query results one row at a time (in addition to other advanced options such as moving backwards through the results as well). For more on cursors in SQL Server, click here. In our case, we are running a query against the previously explained INFORMATION_SCHEMA.TABLES - I'm looking in the CORE database and trying to locate all tables that contain USER_CORE in the name. Note that so far, we have not obtained any final results yet, we've only gotten the list of tables we need to go and query.

FETCH NEXT FROM src_cursor INTO @table

WHILE @@FETCH_STATUS = 0
BEGIN
	SET @sqlCommand = 'INSERT INTO #DataToReview (src_table, user_id)
	SELECT DISTINCT
		''' + @table + ''',
		T.USER_ID
	FROM
		[CORE].[dbo].[ITEMS_TO_DELETE] T
	JOIN [CORE].[dbo].' + @table + ' TBL
		ON TBL.USER_ID = T.USER_ID'

	EXEC (@sqlCommand)

    FETCH NEXT FROM src_cursor INTO @table
END

CLOSE src_cursor
DEALLOCATE src_cursor

Now we are into the thick of it. The first statement will fetch the next row of the cursor query's result (in this case, the first row), storing it in the @table variable. Next, it checks the global SQL Server function @@FETCH_STATUS, which will return the status of the last cursor fetch. I will say that @@FETCH_STATUS is a bit strange; while you might expect it to return a value of 1 if the last fetch returned a row, 0 is, in fact, the result we are looking for. If there is an error or we've reached the end of the result set, a negative value is returned. Click here for more information on @@FETCH_STATUS, how it's used, and what the function returns.

In our code, then if we get that 0 value back for @@FETCH_STATUS, the code between the BEGIN and END block is executed. We are dynamically generating a SQL query that is going to insert its results into the temporary table we created back at the beginning. This query is going to take the ITEMS_TO_DELETE table, which is a table we created only holding the user_id values to delete from the parent table. It then joins to whichever table we are currently looking at in the list the cursor is iterating through, using the user_id as the common field. Once the inserting query has been dynamically generated and stored in @sqlCommand, the EXEC statement is used to carry out the command. Just prior to the end of the block, the FETCH NEXT status grabs the next row, and the block will repeat as long as there is a next row in the result set. Once the end is reached, the cursor is closed and deallocated.

SELECT * FROM #DataToReview
GO

Finally, we conclude our code by selecting the results from our DataToReview temp table. This will give us all of the user_id's that appear in any of the tables we searched, along with the table that they showed up in.

Wrapping Up

The above might not necessarily be a specific scenario you face in SQL Server, but knowing how to dynamically search INFORMATION_SCHEMA for tables, columns, and more is a very useful thing to have in your skillset. Likewise, being able to iterate through results with cursors and build SQL statements dynamically can help save you a great deal of time.

Have questions or suggestions on other ways to solve this problem? Feel free to comment below or reach out on Twitter!

Enjoyed this post? Consider buying me a coffee to fuel the creation of more content.

Twinkly LED Lights Review

Making Your First Open Source Contribution