Navicat Blog

Counting String Occurrences in SQL June 5, 2020 by Robert Gravelle

Although not as proficient at string manipulation as procedural programming languages such as Java, C++, and PHP, SQL does provide many functions for working with string data. These may be employed to trim off extra spaces or characters, determine how long a string is, and concatenate several field values together. String functions are well worth becoming acquainted with as they can help make your code more effective and readable. In today's blog, we'll learn how to count the number of string occurrences within a char, varchar or text field using a couple of native SQL string functions.

Introducing the LENGTH() and REPLACE() Functions

The two functions that we'll be using here today are LENGTH(str) and REPLACE(str, from_str, to_str). LENGTH() returns the length of a string in bytes; REPLACE() returns the string str with all occurrences of the string from_str replaced by the string to_str, by performing case-sensitive matching.

The LENGTH() function returns the length of a string in bytes. This has some important ramifications because it means that for a string containing five 2-byte characters, LENGTH() returns 10. To count straight characters, use CHAR_LENGTH() instead.

Here's an example:

length_function (33K)

Here's an example of the REPLACE() function that changes the protocol of a URL from "http" to "https":

replace_function (41K)

Let's Get Counting

By combining LENGTH() and REPLACE() with the ROUND() function, we can obtain a count of a specific sub-string in a field that contains textual content. Here's an example using the Sakila Sample database that returns the count of the word "Documentary" in the description field of the film table:

count_occurrences (172K)

In essence, our query replaces occurrences of the target sub-string with an empty ("") string and compares the resulting string lengths. The difference between them is the number of occurrences of the sub-string in the source field.

Incorporating Our Query Into a User Function

If you plan on performing word counts on many different tables or using a variety of sub-string values, you should consider incorporating the main calculation into a custom User Function. Here's a function, named `count_string_instances`, that I created in Navicat:

count_occurrences_function (84K)

Testing the Function

We can test our function in-place by clicking the Execute button. That opens a dialog to accept input parameters:

input_param_dialog (21K)

The results confirm that the function is working correctly:

count_occurrences_function_test_result (18K)

Invoking Our Function from the Query

With our function in place, we can replace the calculation portion of the query with a call to the count_string_instances() function. As we begin to type the function name, the Navicat auto-suggest list now includes our function!

auto_complete (49K)

As with all functions, it is inserted into our query with input parameters ready to set. We can navigate between them via the TAB key:

auto_complete_fields (43K)

Here's the updated query with results:

query_with_function (181K)

Conclusion

There are many SQL string functions that can help make your code more effective and readable. These can be especially powerful when combined. In today's blog, we learn how to count the number of string occurrences within a char, varchar or text field by creating a custom user function using Navicat's versatile Function and Stored Procedure Editor.

Navicat Blogs
Feed Entries
Blog Archives
Share