naijR 0.2.2

This post is to announce the arrival of naijR 0.2.2 on CRAN.

New S3 classes

This version of the package introduces an object-oriented style of programming, making constructors available for states and lgas objects. To create an instance of either class, we pass a character vector of States or LGAs as appropriate. These constructors are somewhat permissive and do not perform strict accuracy checks; for that we have the functions is_state and is_lga.
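As a quick illustration of the constructors and checkers just described, here is a minimal sketch. The particular State and LGA names are only sample inputs chosen for the example, and the printed results are omitted since they may differ between package versions.

```r
library(naijR)

# Build objects of the two new classes from plain character vectors
s <- states(c("Lagos", "Kano", "Rivers"))
l <- lgas(c("Amuwo-Odofin", "Obubra"))

# Inspect the S3 classes attached by the constructors
class(s)
class(l)

# Stricter, element-wise validation of a plain character vector;
# "Texas" is deliberately not a Nigerian State
is_state(c("Lagos", "Kano", "Texas"))
```

The constructors give us typed objects to pass around, while is_state and is_lga remain the tools for checking the names themselves.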

Check for LGAs

There are 774 LGAs in the country and they are pivotal to any analytic task done with country data. They are also very often misspelt, as any dataset taken from the wild would reveal. I have taken pains to provide authoritative names for this tier of governance using government sources. These can be easily inspected in the built-in package dataset lgas_nigeria.

The new function is_lga will scan through a vector to check whether its elements are correctly spelt LGAs. Where poorly spelt ones are found, the function fix_region can be used to correct them. The method for lgas objects will attempt to do this automatically using partial matching. For example:

> library(naijR)
> mylga <- c("Amuwo-Odofin", "Bukuru", "Askira-Uba")
> is_lga(mylga)
[1] TRUE FALSE FALSE
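
To give a feel for what approximate matching of names involves, here is a base R sketch built on edit distances. It is not the algorithm used inside the package. The helper closest_name, the short reference vector and the max_dist threshold are all hypothetical and exist only for this illustration.

```r
# Hypothetical reference list; the package ships the real one in lgas_nigeria
reference <- c("Askira/Uba", "Bukkuyum", "Amuwo-Odofin")

# Return the closest reference name, or NA when nothing is close enough.
# max_dist is an arbitrary threshold chosen for this example.
closest_name <- function(x, ref, max_dist = 2) {
  d <- utils::adist(x, ref, ignore.case = TRUE)  # edit distance to each candidate
  i <- which.min(d)
  if (d[i] <= max_dist) ref[i] else NA_character_
}

closest_name("Askira-Uba", reference)  # one character away from "Askira/Uba"
closest_name("Bukuru", reference)      # nothing within the threshold, so NA
```

A name that differs by a single character is recovered, while one that is not close to any candidate is left for the user to decide.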

Fix misspelt regions

A major addition in the current version is the function fix_region, which helps a user to repair any misspelt administrative regions within a dataset. The function has methods for different kinds of regions, i.e. States and Local Government Areas, which are optionally represented as the S3 objects states and lgas, respectively. However, the function also has a method for base character vectors, mainly for States, since they are not that many. To repair our vector mylga, we will create an lgas object first and then pass it as an argument to fix_region.

> fixed <- fix_region(lgas(mylga))
Approximate match(es) not found for the following:
* Bukuru
Warning message:
In lgas(mylga) : One or more elements is not an LGA
> fixed
[1] "Amuwo-Odofin" "Bukuru"       "Askira/Uba"

The LGA Askira-Uba has been changed to its correct spelling, Askira/Uba. However, a match could not be found for the element Bukuru. (Bukuru is actually the name of the headquarters of Jos South LGA of Plateau State.) To continue attempting to repair our vector, we run fix_region in interactive mode:

> fixed <- fix_region(lgas(mylga), interactive = TRUE)
Approximate match(es) not found for the following:
* Bukuru
Do you want to repair interactively? (Y/N): 

The user is prompted to continue interactively. To continue, enter y.

Fixing ‘Bukuru’
Search pattern: buk
Select the LGA 
1: Bukkuyum
2: Retry
3: Skip
4: Quit

We searched using the term buk and only one option was returned, i.e. Bukkuyum. Unfortunately, that’s not the one we are looking for, so we enter 2 to retry and run the search again, this time passing only bu:

Selection: 2
Search pattern: bu
Select the LGA 
 1: Buruku                         2: Akpabuyo                    
 3: Obubra                         4: Obudu                       
 5: Burutu                         6: Abuja Municipal Area Council
 7: Babura                         8: Buji                        
 9: Bunkure                       10: Sabuwa                      
11: Bunza                         12: Kabba/Bunu                  
13: Ijebu East                    14: Ijebu North                 
15: Ijebu North East              16: Ijebu Ode                   
17: Abua/Odual                    18: Tambuwal                    
19: Bursari                       20: Bukkuyum                    
21: Bungudu                       22: Retry                       
23: Skip                          24: Quit      
                  

The LGA I wanted to select was Buruku, so I pick option 1:

Selection: 1
Warning message:
In lgas(mylga) : One or more elements is not an LGA
> fixed
[1] "Amuwo-Odofin" "Buruku"       "Askira/Uba"  
> is_lga(fixed)
[1] TRUE TRUE TRUE

We’ve fixed the LGAs! At this point, any LGAs that could not be fixed can be handled by direct manipulation of the object.
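
A rough sketch of such direct manipulation follows. It uses a plain character vector as a stand-in for the object returned by fix_region, on the assumption that the object can be subset and assigned to like an ordinary character vector. The replacement value is taken from the earlier note that Bukuru is the headquarters of Jos South.

```r
# Stand-in for a partially repaired result; values are illustrative only
fixed <- c("Amuwo-Odofin", "Bukuru", "Askira/Uba")

# Overwrite the element that could not be matched automatically
fixed[fixed == "Bukuru"] <- "Jos South"
fixed
```

Indexing by value rather than by position keeps the replacement correct even if the order of the elements changes.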

Maps

This version of the package provides increased granularity for the Nigeria country map, currently going down to the LGA level.

map_ng(lgas())

To know more about drawing Nigeria maps with the package, see the documentation (?map_ng) or read the vignette.

Conclusion

This version of naijR brings some new functionality to aid with data cleaning and validation of LGA names, as well as LGA level mapping. I would like you to try it out and give me some feedback.

Filed under Data Science

RQDAassist v.0.3.1

This is to announce a new version of the R package RQDAassist, a package whose goal is to make working with RQDA much easier.

This version principally adds new functionality for retrieving codings from a project database. The function, retrieve_codingtable, takes as arguments the file path to an RQDA project and a string containing a valid SQL query (SQLite flavour). By default, one does not need to specify the query; the function supplies one internally to fetch data from the relevant tables in the .rqda file. Thus, for a project MyProject.rqda, one can simply call

retrieve_codingtable("path/to/MyProject.rqda")

The default query that is run internally by this function is as follows:

SELECT treecode.cid AS cid, codecat.name AS codecat
FROM treecode, codecat
WHERE treecode.catid=codecat.catid AND codecat.status=1;

The user is at liberty to form their own queries; a reference for the database tables is in the RQDA package, and the documentation for this function (accessed with ?retrieve_codingtable) provides a quick link to that help page. For example, if we want to just collect the filenames of the transcripts used in an analysis, we can use a different query. Note that the data are returned invisibly, to prevent cluttering of the console, so it’s better to bind them to a variable.

qry <- "SELECT DISTINCT name FROM source WHERE status=1;"
tbl <- retrieve_codingtable("path/to/MyProject.rqda", qry)
tbl
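
For readers curious about what running such a query against an SQLite database looks like, here is a sketch using the DBI and RSQLite packages on an in-memory database. The mock source table and its rows are invented for the example, and whether retrieve_codingtable works this way internally is an assumption on my part.

```r
library(DBI)

# In-memory SQLite database standing in for an .rqda project file
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Mock 'source' table with made-up rows; status = 0 marks a deleted file
mock <- data.frame(
  name   = c("Interview1", "Interview2", "Deleted draft"),
  status = c(1L, 1L, 0L)
)
dbWriteTable(con, "source", mock)

# Same query as above: names of active files only
qry <- "SELECT DISTINCT name FROM source WHERE status=1;"
res <- dbGetQuery(con, qry)
dbDisconnect(con)

res
```

Only the rows with status equal to 1 come back, which is why the query above lists active transcripts and leaves out deleted ones.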

We can easily try this out using material from the excellent RQDA workshop conducted by Lindsey Varner and team. We can download the sample project they used right inside R:

url <- "http://comm.eval.org/HigherLogic/System/DownloadDocumentFile.ashx?DocumentFileKey=101e221b-297e-4468-bfc9-8deccb4adf8c&forceDialog=0"
project <- "MyProject.rqda"
download.file(url, project, mode = 'wb')

If we check the working directory with list.files, we should see the project there. Next, using our package’s function, we can get a data frame with information on codings.

> df <- retrieve_codingtable(project)
> str(df)
Classes ‘codingTable’ and 'data.frame':	39 obs. of  9 variables:
 $ rowid       : int  1 2 3 4 6 9 10 12 13 14 ...
 $ cid         : int  1 1 2 1 2 4 4 1 4 3 ...
 $ fid         : int  2 2 2 2 2 2 3 4 4 4 ...
 $ codename    : chr  "Improved Time Management" "Improved Time Management" "Improved Organization" "Improved Time Management" ...
 $ filename    : chr  "AEA2012 - Post-Program Interview1" "AEA2012 - Post-Program Interview1" "AEA2012 - Post-Program Interview1" "AEA2012 - Post-Program Interview1" ...
 $ index1      : num  1314 1398 1688 1087 2920 ...
 $ index2      : num  1397 1687 1765 1175 2964 ...
 $ CodingLength: num  83 289 77 88 44 296 120 150 210 116 ...
 $ codecat     : chr  "Positive Outcomes" "Positive Outcomes" "Positive Outcomes" "Positive Outcomes" ...

We have now created a data frame with 9 columns, with interesting data in them. Note particularly the variables codename, filename, and codecat. Let us now carry out the other query we gave as an example, to get the filenames of all the transcripts in the project:

> qry <- "SELECT DISTINCT name FROM source WHERE status=1;"
> tbl <- retrieve_codingtable(project, qry)
> tbl
                                name
1  AEA2012 - Post-Program Interview1
2  AEA2012 - Post-Program Interview2
3  AEA2012 - Post-Program Focus Group
4  AEA2012 - Pre-Program Focus Group

This project contains only 4 active files from which all the codings are derived!

A practical point

This function is useful for developing qualitative codebooks, particularly when coding is carried out inductively. As has been demonstrated, it can also be extended to other uses, depending on the kind of data that are retrieved.
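
As a sketch of the codebook idea, the snippet below summarises a small mock data frame that mirrors three of the columns shown in the str() output earlier. The rows are made up, and the column name n_codings is my own choice for the illustration.

```r
# Mock data with the same column names as the coding table shown earlier
df <- data.frame(
  codename = c("Improved Time Management",
               "Improved Time Management",
               "Improved Organization"),
  codecat = "Positive Outcomes",
  CodingLength = c(83, 289, 77)
)

# One row per category/code pair, with the number of codings for each
codebook <- aggregate(CodingLength ~ codecat + codename, data = df, FUN = length)
names(codebook)[3] <- "n_codings"
codebook
```

The same call should work on the data frame returned by retrieve_codingtable, since it relies only on the codename, codecat and CodingLength columns.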

Installation

The easiest way to install the package is from an R session with

# install.packages("remotes")
remotes::install_github("BroVic/RQDAassist")

This is a source package, and to build it on Windows, Rtools needs to have been properly installed.

Filed under Computers & Internet, Data Science

An R package to help with RQDA

A few weeks ago, I published a package on GitHub, which I called RQDAassist. The package was inspired by a script I wrote to help RQDA users, myself included, to install the package after it was archived on CRAN when R 4.0 arrived on the scene. So, when RQDAassist was first published, that was its only real functionality.

Today, I am releasing a minor update (v. 0.2.0) that has a few functions added. It can now convert transcripts written in Word into plain text files – a desired format for RQDA projects – and it can prepare those text files into objects that can be read, in bulk, into an RQDA database. Another thing I personally needed for my work was the ability to search qualitative codes using R scripts rather than the graphical user interface; so I wrote a search function, which currently works for active RQDA projects.

This package has so far been tested on Windows 10 (x64) but it should work fine on other major platforms (any subsequent update will include the relevant tests for Linux and Mac OS).

There are no plans to take this package to CRAN and indeed there should be no need to do so once RQDA installation from that repository is fully restored. But I find the prospect of additional helper functions to be quite useful in my work and hope others do too. I hope to see these functionalities expand over time.

You are welcome to check out this project at the GitHub repository or try it out using the instructions in the README.

Filed under Computers & Internet

Newbies also contribute to open source!

Programmer at work
By Crew crew – https://unsplash.com/photos/4Hg8LH9HoxcImageGallery, CC0, https://commons.wikimedia.org/w/index.php?curid=61684334

For a beginner in programming who encounters the world of “open source”, contributing to projects can seem daunting, if not impossible. Of course, you’re just starting out and can barely construct a working program in the language you are currently learning.

So, do I have to wait until I am proficient or an expert in my favourite programming language before I can contribute to an open source project? How can I be active in the community, and not an onlooker, from the very start?

Easy. Documentation.

I don’t know about others but from my experience, software documentation is often lacking in quantity and quality. I guess because programmers are focused first and foremost on developing working programs, the documentation, manuals, help files, etc. end up having quite a few mistakes, errors and inconsistencies.

So, if you’re new to programming, you may not be able to immediately submit code to that project, but you can always help to improve the documentation. I assure you, this is one area where you can really really make yourself useful, and distinguish yourself as one who brings some value to the table. ‘Cos the documentation is a very important part of any good project.

So, dig in. Clone that GitHub repository and fix any problems you find in the manual.

(Fix)TFM.

Filed under Uncategorized

I uninstalled the Twitter app

Twitter is a sinking ship.

Honestly I’m sick of it. The toxicity. The lies. The biases. The censorship. What started as a fun platform has turned into a daily, waking-hours nightmare.

I remember how I started out on Twitter, back in 2009. At the time, I and one of my friends on MySpace, who was an aspiring model, continued sharing our thoughts on the site. At the time, she wasn’t too sure of her looks and I assured her that she had what it takes to make a good career. And she did make it big time. But she’s since been suspended on Twitter — maybe for showing too much flesh. I won’t mention her name for obvious reasons.

Twitter has not been a very positive experience for me in 2020. The role it has played in silencing valid dissenting medical opinion on the COVID-19 response is what I found most repulsive. I am particularly offended by their censorship of tweets about valid research that does not fit a certain narrative.

The deliberate suppression of tweets on damning information on one of the U.S. presidential candidates is also unforgivable.

Frankly, I’m done. I’ve decided to pull back, first by removing the Twitter mobile app. I will remain active on the platform but on a more impersonal note. I don’t think the site can survive too long anyway. There is no trust anymore and even the beneficiaries of its antics know this.

I remember how we used to complain about porn and terror on Twitter. Nothing much was ever done about it – basically it boiled down to free speech and we just decided to live with it. “Face your tribe and ignore the stuff you don’t like” was the approach we followed. Nowadays, the woke brigade at Twitter will flag a tweet that says “only females can have cervical cancer”. Balderdash!

For me it’s time to scale down. Thank God I don’t have a million followers, so it’s going to be easy to disappear altogether, soon.

Filed under Computers & Internet