The practical usability of the CITation ANalysis package for R statistical computing environment, is shown. The main aim of the software is to support bibliometricians with a tool for preprocessing and cleaning bibliographic data retrieved from SciVerse Scopus and for calculating the most popular indices of scientific impact.

https://cran.r-project.org/web/packages/CITAN/index.html

https://cran.r-project.org/web/packages/CITAN/CITAN.pdf

https://github.com/gagolews/CITAN

https://www.gagolewski.com/publications/2011citan.pdf

library(CITAN)
## Loading required package: agop
## Loading required package: RSQLite
## Loading required package: RGtk2

Sin embargo, la función Scopus_ReadCSV() produce un error en Windows. Para corregirlo:

# Session > Set Working Directory > To Source...
source("Scopus_ReadCSV2.R")

Creación de la base de datos

Se generará el archivo:

dbfilename <- "UDC2015.db"

Primera ejecución: Creación del modelo de DB

Creación del archivo de BD vacío:

conn <- lbsConnect(dbfilename)
## Warning in lbsConnect(dbfilename): Your Local Bibliometric Storage is
## empty. Use lbsCreate(...) to establish one.

Creación del esquema con lbsCreate():

lbsCreate(conn) 
## Warning: RSQLite::dbGetInfo() is deprecated: please use individual metadata
## functions instead

## Creating table 'Biblio_Categories'... Done.
## Creating table 'Biblio_Sources'... Done.
## Creating index for 'Biblio_Sources'... Done.
## Creating table 'Biblio_SourcesCategories'... Done.
## Creating table 'Biblio_Documents'... Done.
## Creating table 'Biblio_Citations'... Done.
## Creating table 'Biblio_Surveys'... Done.
## Creating table 'Biblio_DocumentsSurveys'... Done.
## Creating table 'Biblio_Authors'... Done.
## Creating table 'Biblio_AuthorsDocuments'... Done.
## Creating view 'ViewBiblio_DocumentsSurveys'... Done.
## Creating view 'ViewBiblio_DocumentsCategories'... Done.
## Your Local Bibliometric Storage has been created.
##    Perhaps now you may wish to use Scopus_ImportSources(...) to import source information.

## [1] TRUE

Importar información de Scopus (descargada previamente…) con la función Scopus_ImportSources() (código):

Scopus_ImportSources(conn) # Cuidado con el tiempo de CPU...
## Importing Scopus ASJC codes... Done, 334 records added.
## Importing Scopus source list...

## Warning in doTryCatch(return(expr), name, parentenv, handler): No ASJC @
## row=510.

## Warnings... __TRUNCATED__

## Done, 30787 of 30794 records added; 55297 ASJC codes processed.
## Note: 7 records omitted @ rows=13847,15526,16606,17371,19418,24419,29365.

## [1] TRUE

Incorporar nuevos datos

Con la función Scopus_ReadCSV() se produce un error en Windows:

data <-  Scopus_ReadCSV("udc_2015.csv")
## Error in Scopus_ReadCSV("udc_2015.csv") : Column not found: `Source'.

Empleando la versión modificada:

data <-  Scopus_ReadCSV2("udc_2015.csv")

Añadir los documentos a la base de datos:

lbsImportDocuments(conn, data) 
## Importing documents and their authors... Importing 1324 authors... 1324 new authors added.

## Warning in .lbsImportDocuments_Add_Get_idSource(conn, record$SourceTitle, :
## no source with sourceTitle=''Quaternary Science Reviews'' found for record
## 10. Setting IdSource=NA.

## Warnings... __TRUNCATED__

## Done, 363 of 363 new records added to Default survey/udc_2015.csv.

## [1] TRUE

Se podría añadir una descripción para trabajar con distintos grupos de documentos:

lbsImportDocuments(conn, data, "udc_2015") 

Extraer información de la BD

En siguientes ejecuciones bastará con conectar con la BD

conn <- lbsConnect(dbfilename)

Estadísticos descriptivos

lbsDescriptiveStats(conn)
## Warning: RSQLite::dbGetInfo() is deprecated: please use individual metadata
## functions instead
## Number of sources in your LBS:           30787
## Number of documents in your LBS:         363
## Number of author records in your LBS:    1324
## Number of author groups in your LBS:     1
## Number of ungrouped authors in your LBS: 1324
## 
## You have chosen the following data restrictions:
##  Survey:         <ALL>.
##  Document types: <ALL>.
## 
## Surveys:
##   surveyDescription DocumentCount
## 1    Default survey           363
##   * Note that a document may belong to many surveys/files.

## Document types:
## 
##  ar  cp  ip  re  le  no  sh  er 
## 256  52  24  15   6   2   2   1

## Publication years:
## 
## 2014 2015 2016 
##    1  354    8

## Citations per document:
## 
##   0   1   2   3   4   5   6   7   9  10  11  12  15 
## 223  73  25  17  14   2   1   2   1   2   1   1   1

## Categories of documents:
##          Economics, Econometrics and Finance(all) 
##                                                 8 
##                                  Engineering(all) 
##                                                45 
##                          Arts and Humanities(all) 
##                                                 9 
##                                     Medicine(all) 
##                                                43 
##                         Chemical Engineering(all) 
##                                                10 
##                             Computer Science(all) 
##                                                35 
##   Pharmacology, Toxicology and Pharmaceutics(all) 
##                                                10 
##                                             Other 
##                                                33 
##                            Materials Science(all) 
##                                                22 
##         Agricultural and Biological Sciences(all) 
##                                                27 
##                                  Mathematics(all) 
##                                                30 
## Biochemistry, Genetics and Molecular Biology(all) 
##                                                32 
##                        Environmental Science(all) 
##                                                32 
##                              Social Sciences(all) 
##                                                32 
##                        Physics and Astronomy(all) 
##                                                19 
##                                       Energy(all) 
##                                                10 
##          Business, Management and Accounting(all) 
##                                                10 
##                                   Psychology(all) 
##                                                 8 
##                                    Chemistry(all) 
##                                                51

## Documents per author:
## 
##    1    2    3    4    5    6    7    8    9   10   11   13   16 
## 1000  224   46   18   12    8    6    1    1    2    2    1    1

Otra información

Se puede obtener información acerca de los documentos producidos y las citas recibidas correspondientes a cada autor:

citseq <- lbsGetCitations(conn)
## Data set restrictions:
##  Survey:         <ALL>.
##  Document types: <ALL>.
## 
## Creating citation sequences... OK, 1322 of 1322 records read.
# citseq <- lbsGetCitations(conn, surveyDescription="udc_2015")

Número de autores

length(citseq) 
## [1] 1322
head(names(citseq))
## [1] "LÓPEZ-GARCÍA X."   "MARWAH S."         "OTERO T.P."       
## [4] "IGLESIAS M.P."     "GONZÁLEZ-RIVAS D." "BARROS CASTRO J."
citseq[[4]]
## 229  11 
##   1   0 
## attr(,"IdAuthor")
## [1] 4

Se pueden seleccionar autores:

id <- lbsSearchAuthors(conn, c("Cao R.", "Naya S.", "Naya-Fernandez S."))
id
## [1]   46 1193

Obtener las citas de los trabajos de los autores seleccionados:

citseq2 <- lbsGetCitations(conn, idAuthors=id)
## Data set restrictions:
##  Survey:         <ALL>.
##  Document types: <ALL>.
## 
## Creating citation sequences... OK, 2 of 2 records read.
length(citseq2)
## [1] 2

Obtener los documentos relativos a los autores seleccionados:

id_re  <-  lbsSearchDocuments(conn, idAuthors=id)

Obtener información acerca de los documentos:

info_re <- lbsGetInfoDocuments(conn, id_re)
info_re
## [[1]]
## IdDocument:    16
## AlternativeId: 2-s2.0-84947552209
## Title:         Lifetime estimation applying a kinetic model based on the generalized logistic function to biopolymers
## BibEntry:      Journal of Thermal Analysis and Calorimetry,2015,122,3,,1203,1212
## Year:          2015
## Type:          Article
## Citations:     0
## Authors:       NAYA S./46/NA, ÁLVAREZ A./518/NA, LÓPEZ-BECEIRO J./565/NA, GARCÍA-PARDO S./624/NA, TARRÍO-SAAVEDRA J./631/NA, QUINTANA-PITA S./709/NA, GARCÍA-SABÁN F.J./978/NA
## 
## [[2]]
## IdDocument:    98
## AlternativeId: 2-s2.0-84928890357
## Title:         Bootstrap testing for cross-correlation under low firing activity
## BibEntry:      Journal of Computational Neuroscience,2015,38,3,,577,587
## Year:          2015
## Type:          Article
## Citations:     1
## Authors:       ESPINOSA N./779/NA, MARIÑO J./832/NA, CUDEIRO J./1096/NA, CAO R./1193/NA, GONZÁLEZ-MONTORO A.M./1294/NA
## 
## [[3]]
## IdDocument:    127
## AlternativeId: 2-s2.0-84939982743
## Title:         Classification of wood using differential thermogravimetric analysis
## BibEntry:      Journal of Thermal Analysis and Calorimetry,2015,120,1,,541,551
## Year:          2015
## Type:          Article
## Citations:     0
## Authors:       NAYA S./46/NA, LÓPEZ-BECEIRO J./565/NA, TARRÍO-SAAVEDRA J./631/NA, FRANCISCO-FERNÁNDEZ M./766/NA, ARTIAGA R./1112/NA

Obtener las citas de cada documento:

cit_re  <-  sapply(info_re,  function(x)  x$Citations)
cit_re
## [1] 0 1 0

Cerrar conexión

lbsDisconnect(conn)