Help: How to get pure text from HTML page?
-mrviceroy(杀人者Daniel是也);
2001-5-8{141}(#64322@0)
I don't want to write my own code to strip out all the tags because that wouldn't be accurate. Is there an easy way to do this?
Thanks a lot.
It is easy: open your HTML file in a browser, select all the text in the browser, then paste it into a txt file.
-summit(Joker);
2001-5-8(#64324@0)
Code! I want to automate this.
-mrviceroy(杀人者Daniel是也);
2001-5-8(#64339@0)
Try a Perl module. I remember there is one Perl module that can do this, but I don't have my Perl book with me. Check www.perl.com
-ingrid(樱桃);
2001-5-8(#64348@0)
Go to www.yahoo.com and search for the keyword html2txt
-summit(Joker);
2001-5-8(#64349@0)
sed 's/<[^>]*>//g' input.html > out.txt
-ztech(ztech);
2001-5-31(#85874@0)
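One caveat on the sed one-liner above: it works line by line, so a tag that is split across two lines is not matched, entities such as &amp; are left undecoded, and anything inside <script> or <style> ends up in the output. For automation, a parser-based approach is usually more robust. Below is a minimal sketch using Python's standard html.parser module. The names TextExtractor and html_to_text are made up for illustration, and skipping script/style content is an assumption about what "pure text" should mean here.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}  # assumption: these never count as page text

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    import sys

    # Usage: python html2text.py < input.html > out.txt
    print(html_to_text(sys.stdin.read()))
```

Text from adjacent block elements is joined without a separator in this sketch (for example, two consecutive paragraphs with no whitespace between them run together), so a real script would probably want to emit a newline on tags like p, br, and div.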