Is it possible to replace the value of a cell in a CSV file using grep, sed, or both?

0

I have written the following command

#!/bin/bash
awk -v value="$newvalue" -v row="$rownum" -v col=1 'BEGIN{FS=OFS=","} NR==row {$col=value}1' "${file}.csv" > temp.csv && mv temp.csv "${file}.csv"

Sample Input of file.csv

Header,1
Field1,Field2,Field3
1,ABC,4567
2,XYZ,7890

Assuming newvalue=3, rownum=4 and col=1, the above code changes the first cell of row 4 from 2 to 3:

Required Output

Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890

So if I know the row and column, is it possible to replace that value using grep or sed?

Edit 1: Field3 always has a unique value in each row (in case that helps).

bash csv git-bash linux
2021-11-24 06:52:47
3

1

Assuming your CSV file is as simple as what you show (no commas in quoted fields), and your newvalue does not contain characters that sed would interpret in a special way (e.g. ampersands, slashes or backslashes), the following should work with just sed (tested with GNU sed):

sed -Ei "$rownum s/[^,]*/$newvalue/$col" file.csv

Demo:

$ cat file.csv
Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890
$ rownum=3
$ col=2
$ newvalue="NEW"
$ sed -Ei "$rownum s/[^,]*/$newvalue/$col" file.csv
$ cat file.csv
Header,1
Field1,Field2,Field3
1,NEW,4567
3,XYZ,7890

Explanations: $rownum is used as the address (here the line number) where to apply the following command. s is the sed substitute command. [^,]* is the regular expression to search for and replace: the longest possible string not containing a comma. $newvalue is the replacement string. $col is the occurrence to replace.
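
To see the address and the occurrence number at work without touching a file, here is a small sketch that runs the same substitution on inline data and prints the result to standard output. The rows are copied from the question; the variable names csv and result are only illustrative.

```shell
#!/bin/bash
# Sample data copied from the question.
csv='Header,1
Field1,Field2,Field3
1,ABC,4567
2,XYZ,7890'

rownum=4     # address: only line 4 is edited
col=1        # occurrence: the 1st run of non-comma characters
newvalue=3   # replacement text (assumed free of &, / and \)

# No -i here, so sed writes the edited text to stdout instead of a file.
result=$(printf '%s\n' "$csv" | sed -E "$rownum s/[^,]*/$newvalue/$col")
printf '%s\n' "$result"
```

Only the fourth line changes; every other line passes through untouched because the address restricts the s command to that one line.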

If newvalue may contain ampersands, slashes or backslashes we must sanitize it first:

sanitizednewvalue=$(sed -E 's/([/\&])/\\\1/g' <<< "$newvalue")
sed -Ei "$rownum s/[^,]*/$sanitizednewvalue/$col" file.csv

Demo:

$ newvalue='NEW&\/&NEW'
$ sanitizednewvalue=$(sed -E 's/([/\&])/\\\1/g' <<< "$newvalue")
$ echo "$sanitizednewvalue"
NEW\&\\\/\&NEW
$ sed -Ei "$rownum s/[^,]*/$sanitizednewvalue/$col" file.csv
$ cat file.csv
Header,1
Field1,Field2,Field3
1,NEW&\/&NEW,4567
3,XYZ,7890
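
The two steps above could be wrapped in a small helper. This is only a sketch: the function name replace_cell is made up, GNU sed is assumed (for -i without a suffix), and the escaping step is a variant of the one above that uses | as the delimiter and & in the replacement. Values containing newlines are not handled.

```shell
#!/bin/bash
# replace_cell FILE ROW COL VALUE  (hypothetical helper, GNU sed assumed)
replace_cell() {
  local file=$1 row=$2 col=$3 value=$4 escaped
  # Put a backslash in front of the characters that are special in the
  # replacement part of s/// : slash, backslash and ampersand.
  escaped=$(printf '%s\n' "$value" | sed 's|[/\&]|\\&|g')
  sed -Ei "$row s/[^,]*/$escaped/$col" "$file"
}

# Usage on a throwaway copy of the sample data.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
printf '%s\n' 'Header,1' 'Field1,Field2,Field3' '1,ABC,4567' '2,XYZ,7890' > "$tmp"
replace_cell "$tmp" 3 2 'NEW&\/&NEW'
cat "$tmp"
```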
2021-11-24 11:13:43

This does work. One question, though: I had not come across [^,]* before this answer. If sed is able to replace a specific cell, why do we need to include [^,]*? I tried sed -Ei "$rownum s/$newvalue/$col" file.csv and it threw an error, so I would like to understand this better. Any resource to read up on would be helpful as well.
Helium

We need [^,]* because it is what defines a cell. sed is not a CSV processor, it is a general-purpose text processor, so it has no notion of what you call a cell; we must tell it. The sed substitute command (s) is explained in detail in the sed manual (on GNU/Linux or macOS, try man sed or, even better, info sed). The substitute command you tried is syntactically incorrect because it lacks the search pattern, hence the error.
Renaud Pacalet

Yup, that makes more sense now, when putting it like that.
Helium
1

With sed, how about:

#!/bin/bash

newvalue=3
rownum=4
col=1

sed -i -E "${rownum} s/(([^,]+,){$((col-1))})[^,]+/\\1${newvalue}/" file.csv

Result of file.csv

Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890
  • ${rownum} matches the line number.
  • (([^,]+,){n}) matches n repetitions of the group made of non-comma characters followed by a comma. With n set to col - 1, this is the part of the line before the target column; it is captured and written back as \1, so only the target field (the final [^,]+) is replaced by ${newvalue}.
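
To make the pattern described above more concrete, the following sketch prints the regular expression that the shell builds for a given column and applies it to one sample row. The variable names pattern and out are only illustrative, and newvalue is set to NEW here to keep it visually distinct from the \1 back-reference.

```shell
#!/bin/bash
newvalue=NEW
col=3

# Arithmetic expansion happens before sed sees the script,
# so for col=3 the repetition count becomes {2}.
pattern="(([^,]+,){$((col-1))})[^,]+"
printf 'pattern: %s\n' "$pattern"

# Apply it to a single sample row; \1 puts the skipped prefix back.
out=$(printf '%s\n' '2,XYZ,7890' | sed -E "s/$pattern/\\1${newvalue}/")
printf '%s\n' "$out"
```

Because the pattern uses + rather than *, a row with an empty cell before the target column would not match, and that row would be left unchanged.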
2021-11-24 07:21:19

Even though this does work, isn't it a more complicated way of doing things than Renaud's answer? Why do we need to match the n-time repetition if we can replace the cell directly? Helpful nevertheless.
Helium
0

Let's try a few sed commands

Let us consider a sample CSV file with the following content:

$ cat file

Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5
  1. To remove the 1st field or column:
$ sed 's/[^,]*,//' file

25,11
31,2
21,3
45,4
12,5

This regular expression matches a sequence of non-comma characters ([^,]*) followed by a comma and deletes the match, which removes the 1st field.

  2. To print only the last field, or remove all fields except the last field:
$ sed 's/.*,//' file

11
2
3
4
5

This regex removes everything up to and including the last comma (.*,), which deletes all the fields except the last one.

  3. To print only the 1st field:
$ sed 's/,.*//' file

Solaris
Ubuntu
Fedora
LinuxMint
RedHat

This regex (,.*) removes the characters from the 1st comma to the end of the line, which deletes all the fields except the first one.

  4. To delete the 2nd field:
$ sed 's/,[^,]*,/,/' file

Solaris,11
Ubuntu,2
Fedora,3
LinuxMint,4
RedHat,5

The regex (,[^,]*,) matches a comma, a sequence of non-comma characters and another comma, which is the 2nd column together with its delimiters. Replacing this match with a single comma deletes the 2nd column.

Note: Deleting fields in the middle is harder in sed, since every preceding field has to be matched explicitly.
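
One way to sidestep part of that difficulty, sketched here under the assumption that no field contains a quoted comma, is to let the occurrence number of the s command pick the field: the k-th match of ,[^,]* is the (k+1)-th field together with its leading comma. The function name delete_field is made up for this example, and it only covers fields from the 2nd onwards.

```shell
#!/bin/bash
# delete_field N: drop the N-th comma-separated field (N >= 2) from stdin.
delete_field() {
  local n=$1
  # The (N-1)-th ",field" match is the N-th field plus its leading comma.
  sed "s/,[^,]*//$((n-1))"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | delete_field 2
printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | delete_field 3
```

The first call removes the middle column and the second removes the last one, without spelling out the fields that come before.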

  5. To print only the 2nd field:
$ sed 's/[^,]*,\([^,]*\).*/\1/' file

25
31
21
45
12

The regex matches the first field, the second field and the rest of the line, but captures only the 2nd field in a group. The whole line is then replaced with that group (\1), so only the 2nd field is displayed.

  6. Print only lines in which the last column is a single digit number:
$ sed -n '/.*,[0-9]$/p' file

Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5

The regex (,[0-9]$) checks for a single digit in the last field and the p command prints the line which matches this condition.

  7. To number all lines in the file:
$ sed = file | sed 'N;s/\n/ /'

1 Solaris,25,11
2 Ubuntu,31,2
3 Fedora,21,3
4 LinuxMint,45,4
5 RedHat,12,5

This simulates the cat -n command. awk does it easily using the special variable NR. The '=' command of sed prints the line number of every line, followed by the line itself on the next line. That output is piped to a second sed command, which joins every 2 lines.
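
For comparison, the awk approach mentioned above could look like the following sketch. It writes a few of the sample rows to a temporary file first; the variable names file and numbered are only illustrative.

```shell
#!/bin/bash
# A few rows of the sample data, written to a temporary file.
file=$(mktemp)
trap 'rm -f "$file"' EXIT
printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' 'Fedora,21,3' > "$file"

# NR is awk's running record (line) number; $0 is the whole line.
numbered=$(awk '{ print NR, $0 }' "$file")
printf '%s\n' "$numbered"
```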

  8. Replace the last field by 99 if the 1st field is 'Ubuntu':
$ sed 's/\(Ubuntu\)\(,.*,\).*/\1\299/' file

Solaris,25,11
Ubuntu,31,99
Fedora,21,3
LinuxMint,45,4
RedHat,12,5

This regex matches 'Ubuntu' as the 1st group and everything up to and including the last comma as the 2nd group; the last column is matched but not captured. The replacement writes back the 1st and 2nd groups followed by the new number 99.

  9. Delete the 2nd field if the 1st field is 'RedHat':
$ sed 's/\(RedHat,\)[^,]*\(.*\)/\1\2/' file

Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,,5

The 1st field 'RedHat' (with its comma) and everything after the 2nd field are captured as two groups, and the line is rebuilt from these two groups only. The content of the 2nd field is removed but its delimiters stay, hence the empty field in RedHat,,5.
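
If the intent is to drop the field together with one of its commas, so that the row becomes RedHat,5 rather than RedHat,,5, a variant could look like this. It is a sketch only: the function name is made up, and the pattern is anchored with ^ so that it applies only when RedHat is really the 1st field.

```shell
#!/bin/bash
# Remove the 2nd field and one comma, but only on rows whose 1st field is RedHat.
drop_second_if_redhat() {
  sed 's/^\(RedHat\),[^,]*/\1/'
}

printf '%s\n' 'Ubuntu,31,2' 'RedHat,12,5' | drop_second_if_redhat
```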

  10. To insert a new column at the end (last column):
$ sed 's/.*/&,A/' file

Solaris,25,11,A
Ubuntu,31,2,A
Fedora,21,3,A
LinuxMint,45,4,A
RedHat,12,5,A

The regex (.*) matches the entire line and replaces it with the line itself (&) followed by the new field.

  11. To insert a new column at the beginning (1st column):
$ sed 's/.*/A,&/' file

A,Solaris,25,11
A,Ubuntu,31,2
A,Fedora,21,3
A,LinuxMint,45,4
A,RedHat,12,5

Same as the last example, except that the new column comes before the matched line (&) instead of after it.

I hope this will help. Let me know if you need to use Awk or any other command. Thank you

2021-11-24 07:36:29

thanks for detailed explanation but unfortunately it doesn't solve the issue at hand.
Helium
