Visualizing the Impact of Covid-19 on Tourist Arrivals in Cambodia

Line Chart
Intermediate

In previous recipes, we have worked with COVID-19 case and death data and created some interactive visualizations. This time, we are moving our attention to the socioeconomic impacts of the pandemic. The COVID-19 pandemic has imposed travel restrictions and social distancing measures, which have led to a disruption in employment and impacted livelihoods worldwide. While all economic sectors have been affected by the pandemic, tourism is one of the hardest-hit. The United Nations estimated that over 100 million direct tourism jobs are at risk. This post will analyze the socioeconomic impacts of COVID-19 on Cambodia's tourism sector and prepare a relevant visualization. 

This recipe was produced with the generous support of the Institute for War and Peace Reporting.

Stats

Ingredients
Cambodian Tourism Statistics Report
Tools
Flourish; Tabula; Microsoft Excel

Introduction

According to the World Health Organization, Cambodia has reported 363 cases with no deaths as of December 22, 2020. Despite its apparent success in fighting against the spread of COVID-19, Cambodia has experienced significant negative effects in its key economic sectors: garments, construction, hotels and restaurants, transportation and communication, and agriculture. The impact on these sectors has substantial socioeconomic implications in this developing country. 

1

Understanding the Situation

In this recipe, we will examine the socioeconomic impacts triggered by the collapse of the tourism sector. Among the various economic sectors in Cambodia, tourism is worth exploring because:

  1. Tourism data is official and accurate: There are disaggregated datasets for Cambodia’s tourism sector. Governments in many countries tend to be more transparent about publishing tourism data than data on other sectors of the economy. Every year, Cambodia's Ministry of Tourism issues a "Tourism Statistics Report," which provides tourism data such as the number of international tourist arrivals and tourism expenditure.
  2. Tourism data is up-to-date: We can see the decline in the number of tourists as soon as the pandemic started. In other economic sectors, the effects of COVID-19 may lag and may only become evident in data reported months after the pandemic started. For example, the pandemic may hurt the construction sector, but we may not observe drastic changes this year; instead, the number of construction permits may decline only in the following year. You can build an interesting story from these types of socioeconomic impacts, but the lag makes it hard to turn them into a timely data story.
  3. Tourism data is easy to understand: People can observe the effect of COVID-19 on tourism more easily than on other sectors. For some people, the pandemic disrupts travel plans; for others, livelihoods that rely on the tourism sector are at risk. Tourism is directly tied to people's lifestyles and livelihoods. Indicators used to measure tourism are simple, tangible, and easy to visualize: people (tourists), money (expenditure, receipts), time (duration of stay and travel), and space (distance, length of trips). By contrast, indicators in other sectors, such as manufacturing output, investment, or export value, may be too abstract for some readers to connect with their socioeconomic impacts.
  4. Tourism is vital for Cambodia's economy: The tourism sector has been critical in driving Cambodia's economic expansion in the past decades. The Cambodian government prioritizes tourism as a strategic sector on which millions of livelihoods depend. In 2019, Cambodia earned $4.92 billion through tourism, about 18.7 percent of its GDP.

Travel restrictions have severely damaged the tourism sector, and Cambodia has become one of the twenty countries with the steepest declines in the number of tourists. The following chart shows the number of international tourist arrivals in Cambodia, which declined in 2020 due to the global pandemic. Between January and June 2020, Cambodia received just 1.24 million foreign visitors, which is 74 percent lower than the number of tourist arrivals during the same months in 2019.


In this recipe, we will work with the data on international tourist arrivals to Cambodia from 2015 to 2020 and create an animated visualization as shown below:


2

Preparing the Data

2.1) Gathering and Verifying the Data

First, we need to obtain data on international tourist arrivals to Cambodia from a reliable source. It is a best practice to get data from a primary source. A primary source provides direct or first-hand testimony or evidence. In our context, this would be Cambodia’s Ministry of Tourism.

Unfortunately, the Ministry’s website does not have the most recent report. A Google search, however, will lead you to the Cambodian Tourism Statistics Report as of September 2020 on the website of NagaCorp, a company that operates an integrated resort in Cambodia’s capital, Phnom Penh, which comprises one of the country’s largest luxury hotels and a popular casino. Since NagaCorp is not the official tourism authority, it is advisable to check whether its report is reliable. Comparing the September 2020 report to the other reports posted on the Ministry’s website, you will see that the formats are nearly identical, including the official logo of the Ministry on the cover page. Thus, we may assume that the reports posted on NagaCorp's website have been authored and verified by Cambodia’s Ministry of Tourism.

The table below is from the aforementioned report. It shows the monthly international tourist arrivals to Cambodia between 2015 and 2020. We will use this data table to create an animated chart. 

One trick that we can use here to look for additional data is to slightly modify the URL to access different files on the website. If you examine the URL for the PDF report, you’ll see that it ends with “/tourism_statistics_202009.pdf”. The “202009” at the end hints that this particular file is for September (month 09) of 2020. If we would like to find reports that correspond to other months, we might be able to access them by changing “202009” to some other “YYYYMM” value. For instance, the report for January 2020 can be accessed at this URL: https://www.nagacorp.com/eng/ir/tourism/tourism_statistics_202001.pdf
This is one common technique that can be used to search for datasets within a particular website, if you can guess from the pattern of the URLs that there might be additional files with similarly formatted names that are available on the site. For this recipe, we will just be using data from the September 2020 report. 
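The URL pattern described above can be sketched in a few lines of Python. This is only an illustration: the function names report_url and candidate_urls are made up for this example, and there is no guarantee that a file exists for every month, so each generated link still has to be checked by hand or in a browser.

```python
# Base address and file-name pattern copied from the report link in the text.
BASE_URL = "https://www.nagacorp.com/eng/ir/tourism/"


def report_url(year: int, month: int) -> str:
    """Build the guessed URL for one month's report (YYYYMM suffix)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return f"{BASE_URL}tourism_statistics_{year:04d}{month:02d}.pdf"


def candidate_urls(year: int) -> list:
    """List the twelve candidate URLs for one year; some may not exist."""
    return [report_url(year, month) for month in range(1, 13)]


if __name__ == "__main__":
    # Print the guesses for 2020 so they can be tried one by one.
    for url in candidate_urls(2020):
        print(url)
```

The script only builds the addresses; it does not download anything, which keeps it safe to run before you decide which reports you actually need.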

You can find this table on page 2 along with the graph showing the tourism arrival trends for each year. This graph is informative, but we will be making an animated version that can more effectively convey the impact of COVID-19 on Cambodia’s tourism sector in 2020.

A challenging aspect in this step is that the data table is in a PDF format. To analyze the data and create the visualization, we need to follow extra steps to export this table into a processable format that is machine-readable and structured. 

We will use a free and open source tool called Tabula to extract this dataset from PDF to Excel format. 

 

2.2) Installing Tabula

You will often find data tables in PDF files that cannot be easily copied or imported into Excel or Word. This section will show you how to extract these data tables from PDF files using a free and time-saving tool called Tabula. This tool works well with most PDF files that contain black and white data tables and does not require an internet connection.

Before installing Tabula, check whether Java is installed on your computer. If you already have Java, you may skip ahead to the installation steps for Tabula.

  • Check whether you already have Java by searching for “Java” under Apps & features in Windows Settings.
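If you have Python on your computer, a quick scripted check is another option. The sketch below only looks for a java program on the system PATH, which is an assumption about how Java was installed; a copy of Java that is installed but not on the PATH would not be detected this way. The function names are invented for this example.

```python
import shutil
import subprocess


def java_available() -> bool:
    """Return True if a `java` executable is found on the PATH."""
    return shutil.which("java") is not None


def java_version_text() -> str:
    """Return the text printed by `java -version`, or "" if Java is missing."""
    if not java_available():
        return ""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # `java -version` usually writes to stderr, so check both streams.
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    if java_available():
        print("Java found:")
        print(java_version_text())
    else:
        print("Java was not found on the PATH.")
```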


Once Java is installed, you are ready to start the installation process for Tabula. 



  • You will see the instructions on this page. Follow these steps to have Tabula installed on your computer.

2.3) Converting from PDF to Excel

Now that Tabula has been installed, we can start working on the conversion process from PDF to Excel. 


  • Go to the “Tabula” folder and run the tabula.exe program. 
  • A control window may open. Allow this window to run. 
  • A web browser will pop up. This is where we will convert from PDF to Excel format. If the browser does not open, use your web browser to go to http://localhost:8080
  • Click Browse and upload the PDF file you have previously downloaded from this page. 
  • Click Import


After uploading the file, you can see the file name under the Imported PDFs category. 

  • Click the Extract Data button. 

The PDF file will open. You can now select the tables in this PDF document.

  • Go to page 2. There, you will find the table for International Tourist Arrivals to Cambodia.


Here, we will select the area of the table we want to convert. Select carefully so that only the necessary columns and rows are included. Do not extend the selection to text that is not part of the table, such as table titles or notes.

  • Select the area of the table, as shown in the screenshot below. Be sure that the selection strictly covers monthly (Jan-Dec) and yearly (2015-2020) data. 
  • After the selection, click the Preview & Export Extracted Data button on the panel. 


A window appears that displays the preview of the extracted data in a structured, machine-readable format. Make sure that the data looks correct as shown below.

In case the table is not in the right format, you can click Revise selection(s) to redo the selection. If it still appears to be a bit off, you may have to edit it manually as appropriate.

There are two extraction methods to choose from: Stream and Lattice. The Stream method is used for tables where rows and columns are separated by blank space. Lattice works better for tables where rows and columns are separated by lines. You can see that Tabula automatically selects the Stream extraction method for this table.

Now, we are ready to export the dataset. 

  • From the Export Format drop-down, select CSV. 
  • Then click the Export button.

A CSV file named “tabula-tourism_statistics_202009” will be exported. 

  • Open this CSV file to view the table.




2.4) Cleaning the Dataset

The data table is now in processable format. We can start the data cleaning process for the visualization part. 

In the table, quarterly tourist arrival data (Q1, Q2, Q3, Q4) and total numbers are not required. We will delete these rows. 

  • Select the rows for Q1, Q2, Q3, Q4 and Total. (These are rows 2, 6, 10, 14, and 18.)
  • Right-click > select Delete
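The same cleanup can also be scripted, which is handy if you extract several monthly reports. The sketch below is a minimal example using Python's built-in csv module. It assumes that the first column of the exported file holds the row labels and that the unwanted rows are labeled Q1, Q2, Q3, Q4, and Total; check the labels in your own CSV, since they may differ. The sample values are placeholders, not figures from the report.

```python
import csv
import io

# Labels of the rows to drop (assumed spelling; compared case-insensitively).
UNWANTED = {"Q1", "Q2", "Q3", "Q4", "TOTAL"}


def drop_summary_rows(rows):
    """Keep the header and monthly rows; drop quarterly and total rows."""
    cleaned = []
    for row in rows:
        if not row:
            continue  # skip completely empty lines
        label = row[0].strip().upper()
        if label in UNWANTED:
            continue
        cleaned.append(row)
    return cleaned


if __name__ == "__main__":
    # Placeholder table standing in for the exported Tabula CSV.
    sample = "Month,2019,2020\nQ1,60,30\nJan,10,5\nFeb,20,10\nMar,30,15\nTotal,60,30\n"
    rows = list(csv.reader(io.StringIO(sample)))
    for row in drop_summary_rows(rows):
        print(row)
```

To use it on the real file, replace the placeholder table with rows read from the exported CSV and write the cleaned rows back out with csv.writer.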


The two screenshots below use Excel, but you can follow the same steps in Google Sheets to get the same results.

Your final cleaned data table should look like this:


3

Creating an Animated Chart in Flourish

We have completed the data cleaning process. Let’s proceed to the most exciting part: creating the visualization! We will use Flourish to make an animated chart. 

  • Visit the Flourish website at https://flourish.studio/
  • Click Get started for free to create an account (assuming you don’t have one). 

After you have signed up, a Project page will appear. 

  • Click the button + New visualization

You will be directed to the Template page. Here, you can make a selection from an array of charts, visualization techniques, and functions. 


3.1) Choosing the Chart Template and Uploading the Data Table

We want to create a line chart race in which monthly tourist arrival data for different years are compared. Let’s select the chart type based on what we want. 

  • Scroll down to the Line chart race category. 
  • Select the Simple chart. 

A visualization page with the running line chart race will appear: 

This is an initial preview of your chart. To customize our chart, we need to upload the “International Tourist Arrivals to Cambodia” data.  

  • Click the Data view. 
  • Click Upload Data

  • Locate the tabula-tourism_statistics_202009.csv file on your computer and upload it. 
  • A window will pop up and it will ask you to choose whether to Import publicly or Go private. Since we are using the free version of Flourish, we will go with Import publicly

Flourish will show you the number of uploaded rows.

  • Click Next, select the columns

Now, the data table has been uploaded.

  • Name this project “International Tourist Arrivals to Cambodia”. 

In this graph, each running line represents the number of international tourist arrivals to Cambodia for one year (2015, 2016, 2017, 2018, 2019, 2020) from January to December. For a racing line chart on Flourish, each row of data must represent one year, so we need to transpose the rows and columns.

  • Click on the right-angled arrow on the top leftmost cell. This arrow will transpose the rows and the columns.
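To see what transposing does to the table, here is a small Python sketch. It is not how Flourish works internally; it only illustrates the operation on a placeholder table with made-up values. After transposing, the first column holds the years and the remaining columns hold the months.

```python
def transpose(table):
    """Swap the rows and columns of a rectangular list-of-lists table."""
    return [list(column) for column in zip(*table)]


if __name__ == "__main__":
    # Placeholder values only: months as rows, years as columns.
    months_as_rows = [
        ["Month", "2019", "2020"],
        ["Jan", "10", "5"],
        ["Feb", "20", "10"],
    ]
    # After transposing, each row represents one year.
    for row in transpose(months_as_rows):
        print(row)
```

With the full dataset, the twelve months end up in twelve columns next to the year column, which is why the Score columns in the next step span B through M.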

Afterwards, we need to select the Name column and the Score columns.

The Name column holds the racing lines (years). The Score columns hold the values of these competing lines (tourist arrivals in each month).

  • Type in A for the Name column
  • Type in B-M for the Score columns

Now, let’s get back to Preview. There, you can see the racing lines for each year.

The graph may look odd because it is based on Ranks. We will work on each of the elements on the right to make the animated graph look professional. 




3.2) Designing and Formatting the Chart Elements

On the right of the page, you can see a column listing different chart elements. We will work on each of these elements to make the chart more intuitive and attractive. Although we recommend certain selections, you can play around with these options to explore different possibilities. 


First, we want to make the graph look like the one below, which requires adjustments in the View and Scoring type sections.

View 

We can choose how to animate this chart. Since we want to show the whole picture, 

  • Select Show all for Play mode


Scoring type 

In the chart, we want to show the actual number of tourist arrivals each year instead of ranking them. 

  • Select Scores for Chart mode to show on load
  • Select Higher scores win for Data type
  • Select Competition for Rank ties mode

Next, we want to tidy up the chart as shown below. Can you spot the changes? We adjusted the size, controls, and colors.




Chart sizing

We need to give this chart a compact size so that it can be conveniently embedded anywhere. Let’s change its size.

  • For Height mode, select Match data. 

We can add some margins to the graphs by typing in these values in the respective boxes. 

  • Right:  4
  • Bottom:  1
  • Left:  4
  • Right (mobile):  1


Controls 

Animated graphs usually include a Replay button which allows users to repeat the animation if they desire.

  • For the Ranks/scores toggle, choose Hidden since it is unnecessary. 
  • For the Replay button, choose Visible. 
  • We can rename the Replay button if we would like. We will just leave it as Replay in the text box.  

We can make this Replay button look even better with the settings below:

  • Text size: 1
  • Text weight: Bold
  • Height: 0.4

Leave the options for Button Group Styles as default, as shown below. 


Colors 

Color is an essential element of this visualization. Flourish provides default palettes, but we will customize the color palette for all the lines because we want the line color for 2020 to be distinctive.

  • Let’s type in these hex values for Custom overrides. You may copy and paste these values in the text box as well. 

2020: #ff9966

2019: #ffbf59

2018: #7a8099

2017: #c2b0af

2016: #99cccc

2015: #339999
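If you prefer to prepare the override text outside Flourish, the short Python sketch below checks that each value is a six-digit hex color and prints one "name: color" line per year, matching the list above. The names LINE_COLORS and overrides_text are made up for this example.

```python
import re

# Year-to-color mapping copied from the list above.
LINE_COLORS = {
    "2020": "#ff9966",
    "2019": "#ffbf59",
    "2018": "#7a8099",
    "2017": "#c2b0af",
    "2016": "#99cccc",
    "2015": "#339999",
}

# A six-digit hex color such as #ff9966.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def overrides_text(colors):
    """Return one 'name: #rrggbb' line per entry, rejecting malformed colors."""
    lines = []
    for name, value in colors.items():
        if not HEX_COLOR.fullmatch(value):
            raise ValueError(f"not a six-digit hex color: {value!r}")
        lines.append(f"{name}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Copy the printed lines into the Custom overrides text box.
    print(overrides_text(LINE_COLORS))
```

A typo such as a missing digit raises an error here instead of silently producing a line with the wrong color in the chart.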



Now we have made some significant changes to the chart elements. However, we still need to make the lines and labels cleaner and easier to read.


Line styles 

It is better to have thinner lines, especially when they overlap. 

  • Reshape the line by selecting options as below: 

Line width: 0.1

Opacity: 1

Curve: Straight

Shading behind line: Off


Circle styles

The circles, which mark the beginning and end of the trend lines, are important visual elements of the graph. 

  • We will set them in an appropriate size and shape: 

Start radius: 0.4

End radius: 0.4

End stroke: 0.4

Space between: 4

Stroke color: Background

Image inside circle: Off


Label styles 

We will adjust the labels with these settings:

  • Rank font size as 1 
  • Label font size as 1.5
  • Label color as Auto 
  • Rank position as Outside 
  • Show label as Always 
  • Decimal places as 0. 


In the upcoming step, we will change labels in the X-axis and Y-axis to be more organized with improved visibility. 



Y axis 

We will make some subtle changes to the default values of the Y-axis. 

  • Keep the default settings for Label color and line color. 
  • For Label size, type in 1. 
  • For Label dash, type in 1. 
  • For Zooming, turn off Dynamic Y axis.

We would like the Y-axis values to start from zero.

  • Write in 0 for the Min score
  • Leave blank for Max score

In the Number Styling, 

  • Leave Prefix and Suffix blank. Delete any values in these boxes. 
  • Decimal places should be 0. 


X axis

In the X-axis, 

  • Leave Label color as it is. 
  • Set Label size as 1.2.
  • Choose 0 for Text angle
  • Turn off the Show hidden labels on hover button.


Animation 

We can adjust the speed of these racing lines. Let’s set the animation and mode durations:

Animation duration: 1200

Mode duration: 300



Number formatting

The number formatting can be set to default as below: 

Decimal separator in data sheet: .  (decimal point)

Number format to display: 12,235.67


Layout 

In this section, you can change the fonts used, background color, size and structure of this graph. 

First we will work on the font: 

  • Main font: Source Sans Pro 
  • Text Color: Black


  • For the Background, set Color to On, Image to Off, and Background color to White.
  • For Maximum width, select None.
  • For the Layout Order, select the third option. 
  • For the Space between sections, select the third option.
  • For Margins, all sides (Top, Right, Bottom, Left) should be 1. 
  • Turn off Show borders around visualization.


We are close to the finish line. We will add some information in the header and footer as final elements.

Header 

For the Header, 

  • Select Left for the Alignment. 
  • Title is International Tourist Arrivals to Cambodia. 
  • Subtitle is 2015-2020. 

Make sure that Change title style and Change Subtitle styles are turned off. 

We will also add some text which briefly explains what this chart conveys. 

  • Type in the Text: The number of tourists entering the country has declined tremendously in 2020 because of the COVID-19 pandemic.  
  • Styling is off. 

And we will make changes for the Border:

  • Let’s place the border on the Top. 
  • The width should be 1. 
  • Leave the default color. 
  • Choose the Dotted for Style
  • Type in 1 for Space. 
  • LOGO/IMAGE is disabled.  


Footer

For the Footer, 

  • Select Justify for the Alignment. 
  • The Size is 1. 
  • Source name: Cambodia: Tourism Statistics Report 
  • Source URL: https://www.nagacorp.com/eng/ir/tourism/tourism_statistics_202009.pdf
  • Source Label: Source: 
  • For Logo, Image is Disabled. 
  • Select None for Border


Now we can export and publish our interactive chart. 


  • Click Export and publish > then Publish to share and embed



In the window that appears, click Publish.

You can now embed this animated chart in your website using the link provided. 



4

Analyzing Socioeconomic Impacts

4.1) Socioeconomic Implications 


The decline in tourist arrivals has had devastating socioeconomic impacts on Cambodia, a country that relies substantially on tourism for its national revenue. This decline could result in the loss of approximately 3 billion US dollars in tourism revenue and leave 110,000 workers in the sector at risk. It could also worsen social inequality in the country because the tourism sector largely employs women. By providing opportunities for women at different occupation levels, tourism in Cambodia has been considered an important sector in promoting gender equality. Because of the pandemic, the country’s progress in promoting gender equality may be set back.

You can see how different recent news articles craft stories on tourism in the links below: 

Thailand Eases Curbs on Foreign Tourists Before Peak Travel
Virus cuts Egypt tourist revenues to $4bn
Japan tourism faces 80% drop as coronavirus threatens Abenomics
COVID-19 tourism spend recovery in numbers


4.2) Navigating Newsworthy Issues

On a global scale, the COVID-19 pandemic has created hardship for livelihoods in every sector, with negative consequences across the economic, environmental, human, social, political, and security dimensions. Assessing and reporting on the pandemic's impacts on societies, economies, and vulnerable communities is becoming more and more important. The United Nations Development Programme (UNDP) has prepared such assessment reports. These resources may help you come up with interesting data stories on the socioeconomic impacts of COVID-19.

As we move into the new normal world with heightened uncertainty, understanding socioeconomic impacts has become critical in formulating effective policies and targeting limited resources. Extensive and in-depth studies on different social and economic issues will emerge over time. With the growing necessity of COVID-19 data stories in public awareness and policy dialogue, Thibi Recipes will continue to guide data journalists with efficient approaches in crafting compelling data stories with comprehensive guidelines in data preparation, analysis, and visualization.

