Convert a CSV file from UTF-8 (with BOM) to ANSI / Windows-1251


I'm looking to create a batch file or macro that removes the first line of an auto-generated UTF-8 CSV and converts it to Windows code page 1251 ("ANSI"). I've been looking on the internet and have tried a lot of things, but I can't find one that works.

Removing the first line is simple enough:

@echo off
set "csv=test.csv"
more +1 "%csv%" >"%csv%.new"
move /y "%csv%.new" "export\%csv%" >nul

After that I'm lost. I've tried using the type command:

cmd /a /c type test.csv > ansi.csv 

and many variations on this, but it returns either an empty CP1251 file or a UTF-8 file.

I've also tried a VBScript, but it returned a UTF-8 file without a BOM:

Option Explicit

Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0

Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
    Dim strText

    With CreateObject("ADODB.Stream")
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        .Position = 0
        .SetEOS
        .Charset = "_autodetect" 'Use current ANSI codepage.
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub

UTF8toANSI "utf8-wbom.txt", "ansi1.txt"
UTF8toANSI "utf8-nobom.txt", "ansi2.txt"
MsgBox "Complete!", vbOKOnly, WScript.ScriptName

Edit 1: I tried converting to ISO-8859-1 instead of CP1251 using VBScript:

Option Explicit

Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0

Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
  Dim strText

  With CreateObject("ADODB.Stream")
    .Open
    .Type = adTypeBinary
    .LoadFromFile UTF8FName
    .Type = adTypeText
    .Charset = "utf-8"
    strText = .ReadText(adReadAll)
    .Position = 0
    .SetEOS
    .Charset = "iso-8859-1"
    .WriteText strText, adWriteChar
    .SaveToFile ANSIFName, adSaveCreateOverWrite
    .Close
  End With
End Sub

UTF8toANSI WScript.Arguments(0), WScript.Arguments(1)

This did not work either.

Edit 2: I found a way to convert the files from UTF-8 to ANSI using stringconverter.exe (downloaded from http://www.computerperformance.co.uk/ezine/tools.htm):

setlocal
set _source=c:\users\lloyd.evd\delfirstbat\import
set _dest=c:\users\lloyd.evd\delfirstbat\export
for /f "tokens=*" %%i in ('dir /b /a-d "%_source%\*.csv"') do stringconverter "%_source%\%%~nxi" "%_dest%\%%~nxi" /ansi

However, when I remove the first line of the file (either before or after the conversion, it doesn't matter), the file becomes UTF-8 without a BOM again.

So I need a script that deletes the first row without changing the charset.
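One way to delete the first row without touching the encoding is to work on raw bytes rather than decoded text. The following is a minimal Python sketch of that idea, offered as an alternative to the batch approach and not taken from the original post; the function names `drop_first_line` and `drop_first_line_file` are made up for illustration. It assumes the lines end in LF or CRLF.

```python
from pathlib import Path


def drop_first_line(data: bytes) -> bytes:
    """Return data without its first line; all remaining bytes are unchanged."""
    newline = data.find(b"\n")
    if newline == -1:
        # No line break at all: the whole input is the first line.
        return b""
    return data[newline + 1:]


def drop_first_line_file(src: str, dst: str) -> None:
    """Copy src to dst, skipping the first line, without decoding anything."""
    Path(dst).write_bytes(drop_first_line(Path(src).read_bytes()))
```

Because nothing is decoded, a CP1251 file stays CP1251. Note that a UTF-8 BOM consists of the first three bytes of the file, so it disappears together with the first line.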

The following VBScript could help: the procedure UTF8toANSI converts a UTF-8 encoded text file to another encoding.

Option Explicit

Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0

Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName, ByVal ANSICharSet)
  Dim strText

  With CreateObject("ADODB.Stream")
    .Type = adTypeText

    ' Read the whole source file as UTF-8 text.
    .Charset = "utf-8"
    .Open
    .LoadFromFile UTF8FName
    strText = .ReadText(adReadAll)
    .Close

    ' Reopen the stream with the target charset and write the text back out.
    .Charset = ANSICharSet
    .Open
    .WriteText strText, adWriteChar
    .SaveToFile ANSIFName, adSaveCreateOverWrite
    .Close
  End With
End Sub

'UTF8toANSI WScript.Arguments(0), WScript.Arguments(1)
UTF8toANSI "d:\test\so\38835837utf8.csv", "d:\test\so\38835837ansi1250.csv", "windows-1250"
UTF8toANSI "d:\test\so\38835837utf8.csv", "d:\test\so\38835837ansi1251.csv", "windows-1251"
UTF8toANSI "d:\test\so\38835837utf8.csv", "d:\test\so\38835837ansi1253.csv", "windows-1253"
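For comparison, the same read-as-UTF-8, write-as-ANSI idea can be sketched in Python. This is not part of the original answer; the function name `utf8_to_ansi` and its `skip_lines` parameter are assumptions added for illustration, the latter so that the header row can be dropped in the same pass. Characters missing from the target code page are written as `?` here, since Python's codecs do not apply the character decomposition fallback.

```python
def utf8_to_ansi(src, dst, ansi_charset="windows-1251", skip_lines=0):
    """Re-encode a UTF-8 text file (with or without BOM) to a single-byte code page."""
    # "utf-8-sig" strips a leading BOM if one is present.
    # newline="" keeps the original line endings untranslated.
    with open(src, "r", encoding="utf-8-sig", newline="") as f:
        lines = f.readlines()

    # errors="replace" writes "?" for characters the code page cannot represent.
    with open(dst, "w", encoding=ansi_charset, errors="replace", newline="") as f:
        f.writelines(lines[skip_lines:])
```

A call such as `utf8_to_ansi("test.csv", "export/test.csv", "windows-1251", skip_lines=1)` would remove the first row and convert the rest in one step (the paths are placeholders).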

For a list of character set names known to the system, see the subkeys of HKEY_CLASSES_ROOT\MIME\Database\Charset in the Windows registry:

for /f "tokens=5* delims=\" %# in ('reg query HKCR\MIME\Database\Charset') do @echo "%#"

Data (the 38835837utf8.csv file):

1st line    1250    852 čeština (Čechie)
2nd line    1251    966 русский (Россия)
3rd line    1253    737 ελληνικά (Ελλάδα)

The output shows that characters which can't be converted to a particular character set are converted using character decomposition mapping (č=>c, š=>s, Č=>C etc.); where that is not applicable, they are converted to ? (question mark, the common replacement character):

==> chcp 1250
Active code page: 1250

==> type d:\test\so\38835837ansi1250.csv
1st line        1250    852     čeština (Čechie)
2nd line        1251    966     ??????? (??????)
3rd line        1253    737     ???????? (??????)

==> chcp 1251
Active code page: 1251

==> type d:\test\so\38835837ansi1251.csv
1st line        1250    852     cestina (Cechie)
2nd line        1251    966     русский (Россия)
3rd line        1253    737     ???????? (??????)

==> chcp 1253
Active code page: 1253

==> type d:\test\so\38835837ansi1253.csv
1st line        1250    852     cestina (Cechie)
2nd line        1251    966     ??????? (??????)
3rd line        1253    737     ελληνικά (Ελλάδα)
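The fallback visible in this output (strip the accent where possible, otherwise emit a question mark) can be approximated in Python for experimenting with other inputs. This is only a sketch under assumptions: it uses Unicode canonical decomposition from `unicodedata`, which is not necessarily the exact mapping table Windows applies, and the function name `encode_with_fallback` is hypothetical. It encodes one character at a time, so it is meant for single-byte code pages only.

```python
import unicodedata


def encode_with_fallback(text: str, charset: str) -> bytes:
    """Encode text; unencodable characters lose their accents or become '?'."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # Decompose (e.g. "č" -> "c" + combining caron) and drop the marks.
            decomposed = unicodedata.normalize("NFD", ch)
            base = "".join(c for c in decomposed if not unicodedata.combining(c))
            # Anything still unencodable is replaced by "?".
            out += base.encode(charset, errors="replace")
    return bytes(out)


print(encode_with_fallback("čeština (Čechie)", "cp1251"))  # b'cestina (Cechie)'
```

Greek text sent through the same function with `"cp1251"` comes out as question marks, one per character, which matches the pattern in the listing above.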
