I'm looking to create a batch file / macro that removes the first line of an auto-generated UTF-8 CSV and converts it to Windows code page 1251 ("ANSI"). I've been looking on the internet and have tried a lot of things, but I can't find one that works...
Removing the first line is simple enough:
    @echo off
    set "csv=test.csv"
    more +1 "%csv%" >"%csv%.new"
    move /y "%csv%.new" "export\%csv%" >nul
After that I'm lost. I've tried using the type command in DOS:
    cmd /a /c type test.csv > ansi.csv
and many variations on this, but it either returns an empty CP1251 file or a UTF-8 file.
I've also tried using a VBScript, but it returned a UTF-8 file without BOM:
    Option Explicit

    Private Const adReadAll = -1
    Private Const adSaveCreateOverWrite = 2
    Private Const adTypeBinary = 1
    Private Const adTypeText = 2
    Private Const adWriteChar = 0

    Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
      Dim strText
      With CreateObject("ADODB.Stream")
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        .Position = 0
        .SetEOS
        .Charset = "_autodetect" 'Use current ANSI codepage.
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
      End With
    End Sub

    UTF8toANSI "UTF8-wBOM.txt", "ANSI1.txt"
    UTF8toANSI "UTF8-NoBOM.txt", "ANSI2.txt"
    MsgBox "Complete!", vbOKOnly, WScript.ScriptName
Edit 1: I tried converting to ISO-8859-1 instead of CP1251 using VBScript:
    Option Explicit

    Private Const adReadAll = -1
    Private Const adSaveCreateOverWrite = 2
    Private Const adTypeBinary = 1
    Private Const adTypeText = 2
    Private Const adWriteChar = 0

    Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
      Dim strText
      With CreateObject("ADODB.Stream")
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        .Position = 0
        .SetEOS
        .Charset = "iso-8859-1"
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
      End With
    End Sub

    UTF8toANSI WScript.Arguments(0), WScript.Arguments(1)
This did not work.
Edit 2: I found a way to convert the files from UTF-8 to ANSI using stringconverter.exe (downloaded from http://www.computerperformance.co.uk/ezine/tools.htm ):
    setlocal
    set _source=c:\users\lloyd.evd\delfirstbat\import
    set _dest=c:\users\lloyd.evd\delfirstbat\export
    for /f "tokens=*" %%i in ('dir /b /a-d "%_source%\*.csv"') do stringconverter "%_source%\%%~nxi" "%_dest%\%%~nxi" /ansi
However, when I remove the first line of the file (either before or after the conversion, it doesn't matter), it becomes UTF-8 without BOM again.
So I need a script that deletes the first row without changing the charset.
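A rough sketch of that last requirement, written in Python rather than batch or VBScript (a substitution, not something used in this thread): the file is treated as raw bytes, so nothing is ever decoded or re-encoded. The function names are hypothetical, and the approach assumes a single-byte code page or UTF-8; it would not work for UTF-16.

```python
from pathlib import Path


def strip_first_line(data: bytes) -> bytes:
    """Return data without its first line; the remaining bytes are untouched."""
    end = data.find(b"\n")  # the LF also terminates a CRLF line
    # No line break at all means the file is a single line: nothing is left.
    return b"" if end == -1 else data[end + 1:]


def strip_first_line_of_file(src: str, dst: str) -> None:
    """Copy src to dst minus the first line, without decoding the content."""
    Path(dst).write_bytes(strip_first_line(Path(src).read_bytes()))
```

Calling `strip_first_line_of_file("test.csv", "export/test.csv")` would mirror the `more +1` step above. Note that a UTF-8 BOM, if present, sits at the start of the first line and is removed together with it.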
The following VBScript could help: the procedure UTF8toANSI converts a UTF-8 encoded text file to another encoding.
    Option Explicit

    Private Const adReadAll = -1
    Private Const adSaveCreateOverWrite = 2
    Private Const adTypeBinary = 1
    Private Const adTypeText = 2
    Private Const adWriteChar = 0

    Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName, ByVal ANSICharSet)
      Dim strText
      With CreateObject("ADODB.Stream")
        .Type = adTypeText
        .Charset = "utf-8"
        .Open
        .LoadFromFile UTF8FName
        strText = .ReadText(adReadAll)
        .Close
        .Charset = ANSICharSet
        .Open
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
      End With
    End Sub

    'UTF8toANSI WScript.Arguments(0), WScript.Arguments(1)
    UTF8toANSI "d:\test\so\38835837utf8.csv", "d:\test\so\38835837ansi1250.csv", "windows-1250"
    UTF8toANSI "d:\test\so\38835837utf8.csv", "d:\test\so\38835837ansi1251.csv", "windows-1251"
    UTF8toANSI "d:\test\so\38835837utf8.csv", "d:\test\so\38835837ansi1253.csv", "windows-1253"
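For comparison, the same read-as-UTF-8, write-as-another-charset idea can be sketched in Python, combined with dropping the first line as the question asks. This is not the ADODB.Stream method above; the function names are hypothetical and `cp1251` as the default target is an assumption taken from the question.

```python
from pathlib import Path


def utf8_to_charset(data: bytes, charset: str = "cp1251") -> bytes:
    """Decode UTF-8 bytes, drop the first line, re-encode in the target charset."""
    text = data.decode("utf-8-sig")   # "utf-8-sig" also accepts input without a BOM
    _, _, rest = text.partition("\n")  # everything after the first line break
    # Characters missing from the target charset become "?".
    return rest.encode(charset, errors="replace")


def convert_file(src: str, dst: str, charset: str = "cp1251") -> None:
    """File-level wrapper: binary I/O, so no newline translation takes place."""
    Path(dst).write_bytes(utf8_to_charset(Path(src).read_bytes(), charset))
```

Unlike the VBScript, this sketch replaces every unencodable character with `?`; Python's codecs do not apply the decomposition mapping (č => c) that the Windows conversion performs.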
For a list of character set names known to the system, see the subkeys of HKEY_CLASSES_ROOT\MIME\Database\Charset in the Windows Registry:
    for /f "tokens=5* delims=\" %# in ('reg query hkcr\mime\database\charset') do @echo "%#"
Data (38835837utf8.csv file):
    1st line 1250 852 čeština (Čechie)
    2nd line 1251 966 русский (Россия)
    3rd line 1253 737 ελληνικά (Ελλάδα)
The output shows that characters which can't be converted to a particular character set are converted using character decomposition mapping (č => c, š => s, Č => C, etc.); if that is not applicable, they are converted to ? (question mark, the common replacement character):
    ==> chcp 1250
    Active code page: 1250

    ==> type d:\test\so\38835837ansi1250.csv
    1st line 1250 852 čeština (Čechie)
    2nd line 1251 966 ??????? (??????)
    3rd line 1253 737 ???????? (??????)

    ==> chcp 1251
    Active code page: 1251

    ==> type d:\test\so\38835837ansi1251.csv
    1st line 1250 852 cestina (Cechie)
    2nd line 1251 966 русский (Россия)
    3rd line 1253 737 ???????? (??????)

    ==> chcp 1253
    Active code page: 1253

    ==> type d:\test\so\38835837ansi1253.csv
    1st line 1250 852 cestina (Cechie)
    2nd line 1251 966 ??????? (??????)
    3rd line 1253 737 ελληνικά (Ελλάδα)
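The fallback behaviour seen in this output can be approximated in Python, purely as an illustration. This sketch is an assumption about how to imitate the result, not the mapping Windows actually uses: it strips combining marks after Unicode canonical decomposition and falls back to `?` otherwise. The name `encode_lossy` is hypothetical.

```python
import unicodedata


def encode_lossy(text: str, charset: str) -> bytes:
    """Encode text; per character, fall back to its base letter, then to '?'."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # Canonical decomposition: "č" -> "c" + combining caron; drop the marks.
            base = "".join(
                c for c in unicodedata.normalize("NFD", ch)
                if not unicodedata.combining(c)
            )
            # If the base letter is not in the charset either, emit "?".
            out += base.encode(charset, errors="replace")
    return bytes(out)
```

For the sample data, `encode_lossy("čeština (Čechie)", "cp1251")` gives `b"cestina (Cechie)"`, while Cyrillic or Greek text encoded to a code page that lacks those letters comes out as question marks.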