How to create an R data frame from an XML file
Original link: http://tecdat.cn/?p=16788
Original source: 拓端数据部落 (tecdat) WeChat official account
Reproducing the problem
Software: R
Environment: Windows
Problem description: I have an XML document. Part of the file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<List>
<SubCategory>
<ID>BO</ID>
<Name>Bookcases</Name>
</SubCategory>
<SubCategory>
<ID>CH</ID>
<Name>Chairs</Name>
</SubCategory>
<SubCategory>
<ID>LA</ID>
<Name>Labels</Name>
</SubCategory>
<SubCategory>
<ID>TA</ID>
<Name>Tables</Name>
</SubCategory>
<SubCategory>
<ID>ST</ID>
<Name>Storage</Name>
</SubCategory>
<SubCategory>
<ID>FU</ID>
<Name>Furnishings</Name>
</SubCategory>
<SubCategory>
<ID>AR</ID>
<Name>Art</Name>
</SubCategory>
<SubCategory>
<ID>PH</ID>
<Name>Phones</Name>
</SubCategory>
<SubCategory>
<ID>BI</ID>
<Name>Binders</Name>
</SubCategory>
<SubCategory>
<ID>AP</ID>
<Name>Appliances</Name>
</SubCategory>
<SubCategory>
<ID>PA</ID>
<Name>Paper</Name>
</SubCategory>
<SubCategory>
<ID>AC</ID>
<Name>Accessories</Name>
</SubCategory>
<SubCategory>
<ID>EN</ID>
<Name>Envelopes</Name>
</SubCategory>
<SubCategory>
<ID>FA</ID>
<Name>Fasteners</Name>
</SubCategory>
<SubCategory>
<ID>SU</ID>
<Name>Supplies</Name>
</SubCategory>
<SubCategory>
<ID>MA</ID>
<Name>Machines</Name>
</SubCategory>
<SubCategory>
<ID>CO</ID>
<Name>Copiers</Name>
</SubCategory>
</List>
From this XML file, I want to create an R data frame with the columns ID and Name. Note that ID and Name should include all levels (every distinct value) of each variable.
Solution
Assume the XML shown above is the complete, well-formed file. (The original answer refers to it as taxlots.shp.xml, while the code below reads it from ProductSubcategory.xml; use whichever name your file actually has.)
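XML data is rarely organized in a way that lets the xmlToDataFrame function work properly on a whole document. It is better to extract the contents into a list, or to select the repeating record nodes explicitly, and then bind them into a data frame: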
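library(XML)  # provides xmlParse, xmlToList, getNodeSet and xmlToDataFrame
data <- xmlParse("ProductSubcategory.xml")  # parse the file into an internal XML document
xml_data <- xmlToList(data)  # nested list with one element per <SubCategory>
# Build one row per <SubCategory> node, with the columns ID and Name
dataDictionary <- xmlToDataFrame(nodes = getNodeSet(data, "//SubCategory"))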
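
The "extract a list, then bind it" route mentioned above can also be written out by hand. The following is only a minimal sketch under a few assumptions: it uses a shortened inline copy of the XML (three records) so that it runs without a file on disk, and the names xml_text, records, rows and subcategories are illustrative, not part of the original answer.

```r
library(XML)

# Assumption: a shortened inline copy of the file, so the sketch is self-contained.
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<List>
<SubCategory><ID>BO</ID><Name>Bookcases</Name></SubCategory>
<SubCategory><ID>CH</ID><Name>Chairs</Name></SubCategory>
<SubCategory><ID>LA</ID><Name>Labels</Name></SubCategory>
</List>'

doc <- xmlParse(xml_text, asText = TRUE)

# One list element per <SubCategory>, each holding an ID and a Name string.
records <- xmlToList(doc)

# Turn every record into a one-row data frame ...
rows <- lapply(records, function(rec) {
  data.frame(ID = rec$ID, Name = rec$Name, stringsAsFactors = FALSE)
})

# ... and stack the rows; unname() drops the repeated "SubCategory" names.
subcategories <- do.call(rbind, unname(rows))
rownames(subcategories) <- NULL

print(subcategories)
```

With the real file, replacing the inline string by xmlParse("ProductSubcategory.xml") should give one row per SubCategory. If the columns are needed as factors, they can be converted afterwards, for example with factor(subcategories$ID).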

