PrevPrev Go to previous topic
NextNext Go to next topic
Last Post 06/18/2019 12:49 PM by  steve finger
Parsing CSV with double quotes
 9 Replies
Sort:
You are not authorized to post a reply.
Author Messages
AC_Slater
Software Engineer
Private
Basic Member
(11 points)
Basic Member
Posts:5


Send Message:

--
06/09/2019 1:09 PM

    Hi All,

    I'm hoping someone else might have had to tackle a similar issue, and seeing as though my regex skills are pretty rusty, I thought I'd reach out.

    I am currently using the Data Iterator to go through a CSV file to break up the file by line.  Normally, I would use a second data iterator to go through and break up the fields by their delimiter, which is a comma, but unfortunately some of the fields have valid commas within their data:

    For instance:

    1234,Cake,,"Smith,John",,"Status: Acknowledge,Accept",,Red,,

     

    In the example above, "Smith,John" and "Status: Acknowledge, Accept" should both not be split by the comma.  My hope is that someone might have some javascript experience that would assist me in getting the fields broken up correctly.  Please note that just because one line might have a relevant comma, that same field on the next line may not contain such a comma.

    For instance:

    5678,Car,,"Ross,Bob",,Status: Reject,,Blue,,

    I currently have the outputData of the line data iterator being assigned to a string using in Assign node.  My initial thought is finding some way to replace all the commas with pipes, so it would look something like this:

    1234|Cake||"Smith,John"||"Status, Acknowledge,Accept"||Red||

    5678|Car||"Ross,Bob"||Status: Reject||Blue||

    But I'm not sure the best way to approach it.  Any recommendations?  Thank you all for taking the time to read.

     

     

    Attachments
    pgallucci12
    Sr Application Analyst
    UPHS
    Basic Member
    (13 points)
    Basic Member
    Posts:5


    Send Message:

    --
    06/09/2019 1:23 PM
    Not sure if this is an option for you but I always ask for PIPE | field delimeters (instead of commas). That takes care of the embedded comma issue.
    David Williams
    Private
    Private
    Veteran Member
    (3160 points)
    Veteran Member
    Posts:1064


    Send Message:

    --
    06/09/2019 1:35 PM
    I thought th Data Iterator had an option for quoted text?
    David Williams
    AC_Slater
    Software Engineer
    Private
    Basic Member
    (11 points)
    Basic Member
    Posts:5


    Send Message:

    --
    06/09/2019 5:26 PM

    The CSV is actually created via the PO122, so unless we make a custom program and alter the 4GL, I think we are locked into comma-delimited.

     

    AC_Slater
    Software Engineer
    Private
    Basic Member
    (11 points)
    Basic Member
    Posts:5


    Send Message:

    --
    06/09/2019 5:34 PM

    I believe you can parse by a string, but since the values aren't static, I'm not sure how that would work.  And if I were to parse by just ", then I would run into issues where the dataOutput would look something like this:

     

    outputData1=1234,Cake,,
    outputData2=Smith,John
    outputData3=,,
    outputData4=Status: Acknowledge,Accept
    outputData5,,Red,,

     

    Then trying to accurately parse through the data would become increasingly difficult, as I wouldn't have logic that specified which fields had quotations.

    steve finger
    Senior Technical Consultant
    North State Consulting
    Veteran Member
    (112 points)
    Veteran Member
    Posts:38


    Send Message:

    --
    06/09/2019 5:42 PM

    It's been a little while since i've invoked "workdef", but if i'm not mistaken  you are allowed to select your delimiter character when defining the workfile.  Again....unless i'm mistaken, you don't even have to recompile the program.... but i'm not betting any money on that one.  

    AC_Slater
    Software Engineer
    Private
    Basic Member
    (11 points)
    Basic Member
    Posts:5


    Send Message:

    --
    06/09/2019 6:10 PM

    Interesting thought.  Do you mean jobdef though?  I updated the jobdef and changed the delimiter to a pipe.  Would fields with commas in them remain unchanged?

     

    I attached the jobdef config.

    Attachments
    steve finger
    Senior Technical Consultant
    North State Consulting
    Veteran Member
    (112 points)
    Veteran Member
    Posts:38


    Send Message:

    --
    06/09/2019 7:03 PM
    Like i said earlier....it's been a while since i've played in that sandbox....you need to empirically demonstrate that what i THINK happens is really the truth.

    If you change the delimiter using jobdef, it should hold for that JOB - until the delimiter is changed again for that JOB via jobdef. If you change the delimiter via workdef, that delimiter then becomes the default for that PROGRAM.

    Once again.....better test that. If as i mentioned earlier that you don't need to recompile for the delimiter to be changed, then that would support the notion that workdef would change the overall default delimiter and jobdef would change the delimiter for that job only.
    Karen Sheridan
    Private
    Private
    Veteran Member
    (328 points)
    Veteran Member
    Posts:124


    Send Message:

    --
    06/18/2019 12:24 PM
    Are you Unix or Windows? There is a Unix command that can be used to change the commas to pipes.  Google SED
    steve finger
    Senior Technical Consultant
    North State Consulting
    Veteran Member
    (112 points)
    Veteran Member
    Posts:38


    Send Message:

    --
    06/18/2019 12:49 PM
    it doesn't matter whether they are windows or unix. they are working with IPA which has javascript. There are various ways to accomplish the substitution using javascript
    You are not authorized to post a reply.