Splitting string into multiple rows in Oracle

This may be an improved way (also with regexp and connect by):

with temp as
(
    select 108 Name, 'test' Project, 'Err1, Err2, Err3' Error  from dual
    union all
    select 109, 'test2', 'Err1' from dual
)
select distinct
  t.name, t.project,
  trim(regexp_substr(t.error, '[^,]+', 1, levels.column_value))  as error
from 
  temp t,
  table(cast(multiset(select level from dual connect by  level <= length (regexp_replace(t.error, '[^,]+'))  + 1) as sys.OdciNumberList)) levels
order by name

EDIT:
Here is a simple (as in, “not in depth”) explanation of the query.

  1. length (regexp_replace(t.error, '[^,]+')) + 1 uses regexp_replace to erase anything that is not the delimiter (comma in this case) and length +1 to get how many elements (errors) are there.
  2. The select level from dual connect by level <= (...) uses a hierarchical query to create a column with an increasing number of matches found, from 1 to the total number of errors.

    Preview:

    select level, length (regexp_replace('Err1, Err2, Err3', '[^,]+'))  + 1 as max 
    from dual connect by level <= length (regexp_replace('Err1, Err2, Err3', '[^,]+'))  + 1
    
  3. table(cast(multiset(.....) as sys.OdciNumberList)) does some casting of oracle types.
    • The cast(multiset(.....)) as sys.OdciNumberList transforms multiple collections (one collection for each row in the original data set) into a single collection of numbers, OdciNumberList.
    • The table() function transforms a collection into a resultset.
  4. FROM without a join creates a cross join between your dataset and the multiset.
    As a result, a row in the data set with 4 matches will repeat 4 times (with an increasing number in the column named “column_value”).

    Preview:

    select * from 
    temp t,
    table(cast(multiset(select level from dual connect by  level <= length (regexp_replace(t.error, '[^,]+'))  + 1) as sys.OdciNumberList)) levels
    
  5. trim(regexp_substr(t.error, '[^,]+', 1, levels.column_value)) uses the column_value as the nth_appearance/ocurrence parameter for regexp_substr.
  6. You can add some other columns from your data set (t.name, t.project as an example) for easy visualization.

Some references to Oracle docs:

  • REGEXP_REPLACE
  • REGEXP_SUBSTR
  • Extensibility Constants, Types, and Mappings (OdciNumberList)
  • CAST (multiset)
  • Hierarchical Queries

Leave a Comment