How to contribute

Adding content

If you are interested in making a contribution to this book, the first step is to install and configure GitHub Desktop. Then, use the quick-start links provided below.

We made an example module to help you get started with new content. It lives under template-module in the repo, and the example content page can be found here. Feel free to make a copy and use it for your own module!

To learn more about how the book is organized, see our organization section. To learn about and see some AsciiDoc examples, see our AsciiDoc section.

Adding an example

This example assumes:

  1. the-examples-book repository is located on your local machine at /Users/myusername/projects/the-examples-book.

  2. You’ve installed and configured GitHub Desktop, and your Current Repository (in the upper left-hand corner) is set to the-examples-book. Your screen should look similar to the following.

GitHub Desktop
Figure 1. GitHub Desktop screen

We want to add the following Python example to the Python module.

import pandas as pd

myDF = pd.read_csv("./grades_semi.csv", sep=";")
myDF.head()
  1. First, create a new branch from the "main" branch describing what you are doing, for example, "add-example-reading-data". To do this, click on the middle tab in GitHub Desktop which will show your current branch. Switch branches to the "main" branch. Once complete, click on the middle tab again, and click the New Branch button.

    Click the New Branch button
    Figure 2. Click the New Branch button
  2. You will be presented with a field and description. Ensure that you are creating a branch from the "main" branch. Enter the new branch name in the text area.

    Branch names must be unique: you cannot reuse the name of an existing branch.
    Create the new branch
    Figure 3. Name and create the new branch
  3. Next, look at the current structure of the book and determine which module the example belongs in. In this case, this is a Python example, and therefore most likely belongs in the Python module.

  4. Next, create a new file with the code of the example, and place this file in the examples directory in the python directory. We can see that there is already a python/examples/example01.py file, so let’s call our example example02.py.

  5. If we have already determined we are just adding an example (vs. a page, or module), it is likely that there is already a page where this example fits appropriately. In this case, this is an example of reading a semi-colon-separated file using pandas. Looking in the Python module, we can see that the pandas-read-write-data.adoc page is where this example belongs. Open the document in your favorite text editor, and add the following content in the Examples section of the page.

    ==== How do I read a csv file called `grades_semi.csv` into a `pandas` DataFrame, where `grades_semi.csv` is semi-colon-separated instead of comma-separated?
    
    .Solution
    ====
    [source, python]
    ----
    import pandas as pd
    
    myDF = pd.read_csv("./grades_semi.csv", sep=";")
    myDF.head()
    ----
    
    ----
       grade       year
    0    100     junior
    1     99  sophomore
    2     75  sophomore
    3     74  sophomore
    4     44     senior
    ----
    ====
  6. Upon saving the document, you will see that GitHub Desktop shows the added content in our pandas-read-write-data.adoc page. This addition is staged.

    Staged change in GitHub Desktop
    Figure 4. Staged change in GitHub Desktop
  7. The next step is to commit the changes. Check the files you’d like to commit in the left-hand section of GitHub Desktop. In the lower left-hand section of GitHub Desktop, create a descriptive commit message, for example, "Add an example reading in semi-colon-separated data in Python". In the larger text area, you can add any other important details.

    All text entered in the description section is Markdown friendly — feel free to use bullets or headers.
    Changes ready to commit
    Figure 5. Changes ready to commit
  8. Once the changes are committed, and you are ready to incorporate these changes to the book, the next step is to publish (or push) this branch to our remote repository (hosted on GitHub).

    Branch ready to publish
    Figure 6. Branch ready to publish
  9. Click Publish branch. You should see a screen indicating the branch is being published.

    Publishing branch
    Figure 7. Publishing branch
  10. Once published, your branch will be available to everyone using GitHub. In order to incorporate the changes in this branch to the "main" branch, we must create a pull request or merge request. Click the blue Create Pull Request button.

    Create pull request
    Figure 8. Click Create Pull Request
  11. This will launch your browser and present you with the following screen. On this screen, you can (and should) assign a reviewer (the individual responsible for reviewing the content), designate an assignee (the individual who will actually merge the content), assign a label, and add any further comments.

    Open pull request
    Figure 9. Open pull request

    At this stage, you are done. All you need to do is respond to any follow-up questions from the assignee or reviewer (you will be notified by email if this happens).

    In time, your content will be reviewed, and merged. Once merged, it is a matter of waiting ~5 minutes until the book is compiled, deployed, and the search index is updated. Congratulations, and thank you!
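For contributors who prefer a terminal, the GitHub Desktop steps above map roughly onto a handful of git commands. The sketch below only builds the command lists and does not run them. The helper name `contribution_commands`, the remote name `origin`, and the file path are assumptions made for illustration; they are not part of the book's tooling. The pull request itself is still opened in the browser.

```python
def contribution_commands(branch, files, message, base="main", remote="origin"):
    """Return git commands (as argument lists) mirroring the steps above.

    Nothing is executed here. `remote="origin"` is an assumed remote name.
    """
    return [
        ["git", "checkout", base],              # switch to the base branch
        ["git", "checkout", "-b", branch],      # create the new branch from it
        ["git", "add", *files],                 # stage the changed files
        ["git", "commit", "-m", message],       # commit with a descriptive message
        ["git", "push", "-u", remote, branch],  # publish (push) the branch
    ]


commands = contribution_commands(
    branch="add-example-reading-data",
    files=["pandas-read-write-data.adoc"],  # hypothetical path inside the repo
    message="Add an example reading in semi-colon-separated data in Python",
)
for command in commands:
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` from inside the repository if you want to execute the steps rather than just print them.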

Adding a page

This example assumes:

  1. the-examples-book repository is located on your local machine at /Users/myusername/projects/the-examples-book.

  2. You’ve installed and configured GitHub Desktop, and your Current Repository (in the upper left-hand corner) is set to the-examples-book. Your screen should look similar to the following.

GitHub Desktop
Figure 10. GitHub Desktop screen

We notice there isn’t any content on pandas DataFrames, and we think that information would fit nicely into a separate page.

  1. First, create a new branch from the "main" branch describing what you are doing, for example, "add-pd-dataframe-page". To do this, click on the middle tab in GitHub Desktop which will show your current branch. Switch branches to the "main" branch. Once complete, click on the middle tab again, and click the New Branch button.

    Click the New Branch button
    Figure 11. Click the New Branch button
  2. You will be presented with a field and description. Ensure that you are creating a branch from the "main" branch. Enter the new branch name in the text area.

    Branch names must be unique: you cannot reuse the name of an existing branch.
    Create the new branch
    Figure 12. Name and create the new branch
  3. Next, look at the current structure of the book and determine which module the new page belongs in. In this case, this is a Python specific page, and therefore belongs in the Python module.

  4. Next, create a new file in the Python module’s "pages" directory called pandas-dataframes.adoc. Add the following content to the new page.

    = DataFrames
    
    Some content.
    
    == `some_function or method`
    
    `some_function or method` description.
    
    === Examples
    
    ==== Question text
    
    === Resources
    This structure is appropriate for most pages in the book. If you are creating a page that doesn’t fit this format, do your best and changes may be suggested in the review process.
  5. Make any other additions or changes you’d like to make to the contents of the new page, and save.

  6. Open up the module’s nav.adoc file and add navigation to the new page. In this example, the following would be an appropriate before and after for nav.adoc.

    Before
    * xref:introduction.adoc[Python]
    ** xref:pandas-intro.adoc[pandas]
    *** xref:pandas-read-write-data.adoc[Reading & Writing Data]
    After
    * xref:introduction.adoc[Python]
    ** xref:pandas-intro.adoc[pandas]
    *** xref:pandas-read-write-data.adoc[Reading & Writing Data]
    *** xref:pandas-dataframes.adoc[DataFrames]
    You can nest pages however you’d like — although it is best to keep it well organized. You can read more about navigation using Antora here.
  7. Upon saving the documents, you will see that GitHub Desktop shows the new pandas-dataframes.adoc page and the updated nav.adoc file. These changes are staged.

    Staged change in GitHub Desktop
    Figure 13. Staged change in GitHub Desktop
  8. The next step is to commit the changes. Check the files you’d like to commit in the left-hand section of GitHub Desktop. In the lower left-hand section of GitHub Desktop, create a descriptive commit message, for example, "Add a page that covers pandas DataFrames". In the larger text area, you can add any other important details.

    All text entered in the description section is Markdown friendly — feel free to use bullets or headers.
    Changes ready to commit
    Figure 14. Changes ready to commit
  9. Once the changes are committed, and you are ready to incorporate these changes to the book, the next step is to publish (or push) this branch to our remote repository (hosted on GitHub).

    Branch ready to publish
    Figure 15. Branch ready to publish
  10. Click Publish branch. You should see a screen indicating the branch is being published.

    Publishing branch
    Figure 16. Publishing branch
  11. Once published, your branch will be available to everyone using GitHub. In order to incorporate the changes in this branch to the "main" branch, we must create a pull request or merge request. Click the blue Create Pull Request button.

    Create pull request
    Figure 17. Click Create Pull Request
  12. This will launch your browser and present you with the following screen. On this screen, you can (and should) assign a reviewer (the individual responsible for reviewing the content), designate an assignee (the individual who will actually merge the content), assign a label, and add any further comments.

    Open pull request
    Figure 18. Open pull request

    At this stage, you are done. All you need to do is respond to any follow-up questions from the assignee or reviewer (you will be notified by email if this happens).

    In time, your content will be reviewed, and merged. Once merged, it is a matter of waiting ~5 minutes until the book is compiled, deployed, and the search index is updated. Congratulations, and thank you!
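The nav.adoc change in step 6 is a one-line insertion, which the following sketch reproduces on the "before" text from that step. The function name `add_nav_entry` is hypothetical, and the sketch assumes each entry sits on its own line and starts with its run of asterisks.

```python
def add_nav_entry(nav_text, after_target, new_target, label):
    """Insert an xref line directly after the entry pointing at `after_target`.

    The new entry copies the nesting level (the run of leading asterisks)
    of the entry it follows.
    """
    lines = nav_text.splitlines()
    marker = f"xref:{after_target}["
    for index, line in enumerate(lines):
        if marker in line:
            stars = line.lstrip().split(" ", 1)[0]  # e.g. "***"
            lines.insert(index + 1, f"{stars} xref:{new_target}[{label}]")
            return "\n".join(lines) + "\n"
    raise ValueError(f"no nav entry found for {after_target}")


before = (
    "* xref:introduction.adoc[Python]\n"
    "** xref:pandas-intro.adoc[pandas]\n"
    "*** xref:pandas-read-write-data.adoc[Reading & Writing Data]\n"
)
after = add_nav_entry(
    before, "pandas-read-write-data.adoc", "pandas-dataframes.adoc", "DataFrames"
)
print(after)
```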

Adding a module

Adding an appendix

What we are referring to as an "appendix", Antora calls a "component". If a particular topic is largely outside of the scope of The Data Mine curriculum for most students, it may belong in an appendix instead. Adding an appendix to the book is straightforward.

  1. Create a new (public) repository with the following minimal structure. Alternatively, copy the following files and directories from an existing appendix repository into your new one: .github, .gitignore, content (directory), readme.md.

    my_repo2
    └── content
        ├── antora.yml
        └── modules
            └── ROOT
                ├── nav.adoc
                └── pages
    
    4 directories, 2 files
  2. Fill in the content of the antora.yml configuration file. The following is an example of a minimal antora.yml file.

    name: example-appendix
    title: Example Appendix
    version: master
    display_version: stable
    start_page: ROOT:introduction.adoc
    nav:
    - modules/ROOT/nav.adoc
    Choose the name carefully. The name ends up appearing as the first value in the URL’s path and should have no spaces. For example: https://the-examples-book.com/example-appendix/<version>/<page-name>.
  3. Create and place the desired AsciiDoc content in the pages directory. The introduction.adoc page will be the landing page for your appendix. For new high level pages, add lines to the nav.adoc file. If you do create further navigation in nav.adoc, you can then create new .adoc files with the corresponding name in the pages directory. Make sure to follow our page naming convention.

  4. When you are satisfied with the results, modify the antora-playbook.yml file in our primary repository, and create a Pull Request. This is as simple as adding another item to the content sources section of the .yml file. For example:

    - url: https://github.com/my_organization/my_appendix_repo
      branches: main
      start_paths: content

    In the primary repository, you will also need to change the introduction.adoc file to add a link on the homepage to your appendix. For example:

    * xref:example-appendix:ROOT:introduction.adoc[Example Appendix]

    When you are done making updates, make sure to push all changes in both your new repository and the primary Examples Book repository. All content incorporated into this book will be indexed and added to the search functionality by the maintainer.

    Check out this book’s antora-playbook.yml file and scholar appendix for more details.
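As a rough illustration of steps 1 and 2, the sketch below creates the minimal layout and writes an antora.yml with the keys shown above. The function name `scaffold_appendix` and the single line written to nav.adoc are assumptions for the example, and it runs in a temporary directory so nothing real is touched.

```python
import tempfile
from pathlib import Path

ANTORA_YML = """\
name: {name}
title: {title}
version: master
display_version: stable
start_page: ROOT:introduction.adoc
nav:
- modules/ROOT/nav.adoc
"""


def scaffold_appendix(repo_dir, name, title):
    """Create the minimal appendix layout under `repo_dir`."""
    if " " in name:
        raise ValueError("the name appears in the URL and should have no spaces")
    content = Path(repo_dir) / "content"
    root = content / "modules" / "ROOT"
    (root / "pages").mkdir(parents=True, exist_ok=True)
    (content / "antora.yml").write_text(ANTORA_YML.format(name=name, title=title))
    # Assumed starting navigation: a single link to the landing page.
    (root / "nav.adoc").write_text("* xref:introduction.adoc[Introduction]\n")
    return content


# Try it in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    content_dir = scaffold_appendix(tmp, "example-appendix", "Example Appendix")
    created = sorted(p.relative_to(tmp).as_posix() for p in content_dir.rglob("*"))
    print("\n".join(created))
```

You would still need to add introduction.adoc to the pages directory and register the repository in antora-playbook.yml as described in steps 3 and 4.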

Designing for accessibility

Accessibility in website and content design is a vital aspect that is often overlooked. When adding new content to The Examples Book it is recommended that you follow as many of the accessibility guidelines below as possible. In addition, the Data Mine staff is continuing to learn about how we can make the examples book more accessible. If you have any suggestions, please email us at [email protected].

It’s also important for us to note that we are not experts in accessibility. Our tips are based on the sources below, but the list is not exhaustive by any means. In our limited experience we recommend thinking through the different ways that people may access the site and attempting to design for as many of those ways as possible.

High-level Checklist for Web and Content Design

  • Is the site consistent in design (headings, lists, hierarchies)?

  • Is special text, such as code or quotes, easy to identify and understand?

  • Does the text avoid abbreviations and clearly explain topic-specific terms or phrases?

  • Do videos have captions, dictations, or other alternative communication methods?

  • Do images have alternative text?

  • If the images contain important information are there other methods for gaining the information?

  • Are visualizations designed for easy understanding and clear readability?

  • Are the visualizations designed for color blind individuals?
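Some items on this checklist can be partly automated. As one possible sketch, the function below scans AsciiDoc text for image macros that have no alternative text. It is a simple regular-expression check rather than a real AsciiDoc parser, the name `images_missing_alt` is hypothetical, and it assumes the alt text is either the first positional attribute or given as `alt=`.

```python
import re

# Matches inline (image:) and block (image::) macros: target, then [attributes].
IMAGE_MACRO = re.compile(r"image::?([^\s\[]+)\[([^\]]*)\]")


def images_missing_alt(asciidoc_text):
    """Return targets of image macros that carry no alternative text."""
    missing = []
    for target, attributes in IMAGE_MACRO.findall(asciidoc_text):
        first = attributes.split(",", 1)[0].strip()
        # An empty first slot, or a named attribute such as width=700,
        # means no positional alt text was written.
        has_positional = bool(first) and "=" not in first
        has_named = "alt=" in attributes
        if not (has_positional or has_named):
            missing.append(target)
    return missing


sample = """\
image::figure01.webp[A bar chart of grades by year, width=700, height=500]
image::figure02.webp[]
image::figure03.webp[width=700, height=500]
"""
print(images_missing_alt(sample))
```

A check like this only confirms that alt text exists; whether the text is actually helpful still needs a human reader.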

Tips for Checking Your Design

  • Test with text-to-voice tools, such as VoiceOver for Mac.

  • Attempt to navigate the site without sound on the videos or other media.

  • Test visualizations with software tools that simulate colorblindness.

Organization

This book is organized into the core book, and supplementary appendices.

The core book contains content we hope every student learns. For example, a code snippet showing how to read and write data to a file using Python would most likely get added somewhere in the core book.

Supplementary appendices are reserved for topics that are largely beyond the scope of The Data Mine curriculum for most students. In certain cases, a corporate partner’s team may be required to quickly learn about a topic. For example, perhaps a team is working on an app to translate natural language to an SQL query. In this scenario, a mentor may ask students to take a look at content in the NLP appendix. Before creating a Pull Request to add content, please take a moment to figure out where that content most appropriately fits. When in doubt, please feel free to ask!

Within the core book or an appendix, you will find a certain file structure.

Example book or appendix structure
my_repo
└── content (1)
    ├── antora.yml (2)
    └── modules (3)
        ├── ROOT (4)
        │   ├── attachments (5)
        │   │   └── example-project.zip
        │   ├── examples (6)
        │   │   ├── example01.py
        │   │   ├── example02.R
        │   │   └── example03.sh
        │   ├── images (7)
        │   │   ├── figure01.png
        │   │   └── figure02.png
        │   ├── nav.adoc (10)
        │   ├── pages (8)
        │   │   ├── getting-started.adoc
        │   │   └── storing-data.adoc
        │   └── partials (9)
        │       └── warning.adoc
        ├── module1 (11)
        │   ├── attachments
        │   ├── examples
        │   ├── images
        │   ├── nav.adoc
        │   ├── pages
        │   └── partials
        └── module2
            ├── attachments
            ├── examples
            ├── images
            ├── nav.adoc
            ├── pages
            └── partials

20 directories, 13 files

The file structure is important. Special keywords in the provided example that should not be modified are:

1 content
2 antora.yml
3 modules
4 ROOT
5 attachments
6 examples
7 images
8 pages
9 partials
10 nav.adoc
While not specifically highlighted in the provided example, the attachments, examples, images, pages, and partials folders are examples of repeated keywords, or keywords that are reused in each module. While nav.adoc is not technically a special keyword, we would like to keep this consistent with all modules and appendices added to the book.
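Because these names are fixed, a misnamed folder is easy to detect. The following is a minimal sketch, assuming a module directory should contain only the five family directories and nav.adoc; the function name `unexpected_entries` and the deliberately wrong "figures" folder are made up for the example.

```python
import tempfile
from pathlib import Path

FAMILY_DIRECTORIES = {"attachments", "examples", "images", "pages", "partials"}


def unexpected_entries(module_dir):
    """List names in a module directory that are not family directories or nav.adoc."""
    allowed = FAMILY_DIRECTORIES | {"nav.adoc"}
    return sorted(p.name for p in Path(module_dir).iterdir() if p.name not in allowed)


with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "content" / "modules" / "module1"
    for name in ("pages", "images", "figures"):  # "figures" is a deliberate mistake
        (module / name).mkdir(parents=True)
    (module / "nav.adoc").write_text("* xref:introduction.adoc[Introduction]\n")
    problems = unexpected_entries(module)
    print(problems)
```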

In addition, in the core book repository, you will find a file called antora-playbook.yml in the root of the repository. This file is responsible for pulling in all of the resources for the core book and all of the dependencies.

This book uses Antora to render the online book. The following is a summary of information that can be found online. For a more in-depth look at Antora, please see the official documentation.
content

This is a directory at the root-level of the repository which should contain the antora.yml file and a modules directory.

antora.yml

This is a configuration file that contains information about the book or appendix contained within the sibling modules directory. At the time of writing, this is what the core book’s antora.yml file looked like.

name: book (1)
title: The Examples Book (2)
version: 0.1.0 (3)
start_page: ROOT:introduction.adoc (4)
nav:
- modules/ROOT/nav.adoc (5)
- modules/scholar/nav.adoc
- modules/unix/nav.adoc
- modules/SQL/nav.adoc
- modules/r/nav.adoc
- modules/python/nav.adoc
- modules/FAQs/nav.adoc
- modules/projects/nav.adoc
- modules/corporate_partners/nav.adoc
- modules/geospatial/nav.adoc
- modules/contributors/nav.adoc
1 This is the name of the core book or appendix. This appears as the first value in our URL’s path. For example: https://the-examples-book.com/book/<version>/<page-name>.
2 The title of the core book or appendix. This typically appears to the right of the favicon in the browser tab.
3 The version of the core book or appendix. This appears as the second value in our URL’s path. For example: https://the-examples-book.com/book/0.1.0/<page-name>.
4 The location of the "home" page for the core book or appendix. Where do we want Antora to start?
5 A list of nav.adoc files. These files contain a list of anchor links that appear in the left-hand menu. For example, at the time of writing this, modules/ROOT/nav.adoc looked like this.
* xref:introduction.adoc[Introduction]
* xref:how-to-contribute.adoc[How to contribute]
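To make the role of name and version in the URL concrete, here is a small sketch that assembles a page URL following the `<name>/<version>/<page-name>` pattern described in callouts 1 and 3. The helper `page_url` is hypothetical, and the exact form of the page name in the published URL (for example, whether it carries an extension) depends on how the site is configured.

```python
def page_url(name, version, page_name, base="https://the-examples-book.com"):
    """Build a page URL following the <name>/<version>/<page-name> pattern."""
    return f"{base}/{name}/{version}/{page_name}"


# name and version come from the core book's antora.yml;
# "introduction" stands in for <page-name>.
url = page_url("book", "0.1.0", "introduction")
print(url)
```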
modules

The directory containing (at a minimum) the ROOT module, and any other custom-named modules you would like. In our example we also had a module1 module and a module2 module.

ROOT

The ROOT module. This is the default, required module. No further modules are required. An appendix or core book may not need more than the ROOT module. In fact, our scholar appendix only contains the ROOT module.

attachments

A "family directory" that should be used to store binaries or other large files. These attachments can then be referenced internally. See here.

examples

A "family directory" that should be used to store examples. For this book, examples will most likely be code snippets. In practice, any potentially reusable code snippet should be added to the examples directory. This enables maintainers to update a code snippet once, and have it applied everywhere it appears in the book. These examples can be referenced internally, and included in a code chunk. See here.

images

A "family directory" that should be used to store all figures and images. For this book, we will require all images to be labeled figureXXX where XXX is a number starting at 001 and ending at 999. These images can be referenced internally. See here.

pages

A "family directory" that should be used to store each page of the book. These are AsciiDoc files. This is where the actual content lives.

In this book, each .adoc file should be named the same as the page title, in all lowercase, with spaces replaced by hyphens. For example, a page with the title "Introduction to Python" should be named introduction-to-python.adoc.
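The naming rule is mechanical enough to express in a few lines. This is a minimal sketch with a hypothetical helper name, `page_filename`; it handles only the case described above (lowercasing and replacing spaces) and does not treat punctuation specially.

```python
import re


def page_filename(title):
    """Lowercase the title, replace runs of whitespace with hyphens, add .adoc."""
    slug = re.sub(r"\s+", "-", title.strip().lower())
    return f"{slug}.adoc"


print(page_filename("Introduction to Python"))  # introduction-to-python.adoc
```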

partials

A "family directory" that should be used to store content snippets. Provided examples include common descriptions, terminology, or referenced tables. Any content that doesn’t fit into a previously named "family directory", but will most likely be reused in multiple pages should be added as a partial to the partials directory. Partials can be referenced internally. See here.

nav.adoc

This file contains a list of anchor links that end up as your navigation links in the menu on the left-hand side of the book. For example, at the time of writing this, modules/ROOT/nav.adoc looked like this.

* xref:introduction.adoc[Introduction]
* xref:how-to-contribute.adoc[How to contribute]

AsciiDoc

This book is written using AsciiDoc. AsciiDoc is an open and powerful format for writing notes, text documents, books, and more. It is easy to write technical documentation in AsciiDoc and quickly convert the text to formats such as websites, ebooks, and PDFs.

Below are a variety of AsciiDoc examples from powerman. To see how content is rendered, compare the adoc file for this page to its output below.

Headers

Level 1

Text.

Level 2

Text.

Level 3

Text.

Level 4

Text.

Level 5

Text.

Paragraphs

Optional Title

Usual paragraph.

Optional Title
Literal paragraph.
	Must be indented
Optional Title
list = ['first', 'second', 'third',]

Not code in the next paragraph.

Optional Title
This is an example of a single-paragraph note.
This is a tip.
This is important.
This is a caution note.
This is a warning.

Blocks

Optional Title
*Listing* Block

Use for code or file listings
Optional Title
import pandas as pd

myDF = pd.read_csv("myFile.csv")

Alternatively, you can also insert the content of source code files in a module’s examples directory:

File in same module
import pandas as pd

myDF = pd.read_csv("myFile.csv")
File in different module
include::python:example$example01.py[]
Optional Title

Sidebar Block

Use for sidebar notes.

Optional Title

Example Block

Use for examples

The default caption "Example:" can be changed using [caption="Custom: "] before the example block.

Optional Title

Note Block

Use for multi-paragraph notes.

*Passthrough* Block

Use for backend specific markup like:

1 2

Optional Title
*Literal* Block

Use when literal paragraph (indented) like
	1. First
	2. Second

is incorrectly processed as list.
Optional Title

Quote Block

Use to cite somebody.

— cite author
cite source

Text

forced
line break

bold text

italic text

monospaced text

highlighted text

‘single quoted’

“double quoted”

textsuper

textsub

A command: echo "some command"

monospaced and *bold*

passthrough bold

Special symbols: © ® ™ — …​ → ← ⇒ ⇐ ¶

Escape characters: _italic_, _italic_, t__e__st, t__e__st, bold, <b>normal</b>, &#182;, \`not single quoted', \``not double quoted''

Text for the custom anchor link custom-anchor-link-02.

Footnotes

This is true.[1]

Really, it is.[1]

Images

We want to keep this site fast. The following are best practices for adding images to this book.

  • Reduce the size of images using the instructions here.

  • Specify both the height and width of each image.

  • Lazy load images by adding loading="lazy".

  • Use block images when possible.

  • Put all images in the associated module’s "images" folder, and label them figure01.webp, figure02.webp, etc.

  • Always add a caption by adding title="Description of image."
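Several of these practices end up as attributes on the image macro. The sketch below assembles a block image macro that sets the alt text, width, height, lazy loading, and a caption. The helper `block_image` is hypothetical, the attribute spellings follow the bullets above, and alt text containing commas would need quoting that this sketch does not handle.

```python
def block_image(filename, alt, width, height, caption):
    """Return an AsciiDoc block image macro with size, lazy loading and a caption."""
    attributes = [
        alt,
        f"width={width}",
        f"height={height}",
        'loading="lazy"',
        f'title="{caption}"',
    ]
    return f"image::{filename}[{', '.join(attributes)}]"


macro = block_image(
    "figure01.webp", "Alternate text for figure01", 700, 500, "Description of image."
)
print(macro)
```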

Inline images

To include an image inline from same module: Alternate text for figure01

To include an image inline from different module: Alternate text for CRS

Block images

To include a block image from the same module:

Alternate text for figure01

To include a block image from a different module:

Alternate text for CRS

Lists

  • Unordered list

  • Another item

  • Another item

    Some important note really quick.
  • Back to the list

    1. Ordered list

    2. Another item

    3. Another item

      Some important note really quick.
    4. Back to the list

      First

      First item in description list.

      Second

      Second item in description list.

      • Checked item 1

      • Checked item 2

      • Unchecked item

UI Macros

Button syntax

Press Submit when you are ready to submit and Cancel, if you want to cancel.

Keybinding syntax

To exit vim press Ctrl+C, type :wq, and press Enter.

To exit, click on File > Close.

To export as PDF, click on File > Export as… > PDF.

Tables

LaTeX

We use MathJax to render LaTeX. Our current configuration allows inline equations delimited by single dollar signs ($) and display equations delimited by double dollar signs ($$). To write a literal dollar sign, prepend a backslash (\); for example, to display $2.00 you would type \$2.00.

This is an example of an inline equation: $e=mc^2$

We’ve configured MathJax to not process text in the following tags: <script>, <noscript>, <style>, <textarea>, <pre>, <annotation>, <annotation-xml>, and <code>.
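If you are pasting prose that contains many literal dollar signs, the escaping can be scripted. This is a small sketch with a hypothetical helper, `escape_dollars`; it assumes the text contains no equations, since it escapes every dollar sign that is not already preceded by a backslash.

```python
import re


def escape_dollars(text):
    """Prefix each unescaped $ with a backslash so MathJax leaves it alone."""
    return re.sub(r"(?<!\\)\$", r"\\$", text)


print(escape_dollars("The ticket costs $2.00."))
```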

Optimize images

We want to keep this book as fast to load as possible. The following are our suggestions to optimize images prior to adding them to this book.

Resize the image to the width and height that it will be displayed online
magick convert input.png -filter Gaussian -sharpen 0x3 -resize 700x500 output.png
Reduce the number of colors used to 64
pngquant 64 --skip-if-larger --strip --ext=.png --force output.png
Use improved compression algorithm
zopflipng -y output.png output.png
Compress to webp
cwebp -q 30 output.png -o output.webp

You can read more about some of these optimizations in this excellent GitLab blog post.
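The four steps can be chained from a script. The sketch below only builds the commands listed above as argument lists, in order, and does not run them; the function name `optimization_commands` is hypothetical, and the tools are assumed to be installed and on your PATH if you do execute them.

```python
from pathlib import Path


def optimization_commands(source_png, output_png, width, height):
    """Return the resize, quantize, recompress and webp commands as argument lists."""
    webp = str(Path(output_png).with_suffix(".webp"))
    return [
        # Resize to the dimensions the image will be displayed at.
        ["magick", "convert", source_png, "-filter", "Gaussian",
         "-sharpen", "0x3", "-resize", f"{width}x{height}", output_png],
        # Reduce the number of colors used to 64.
        ["pngquant", "64", "--skip-if-larger", "--strip", "--ext=.png",
         "--force", output_png],
        # Recompress the PNG in place.
        ["zopflipng", "-y", output_png, output_png],
        # Convert to webp.
        ["cwebp", "-q", "30", output_png, "-o", webp],
    ]


steps = optimization_commands("input.png", "output.png", 700, 500)
for step in steps:
    print(" ".join(step))
```

To execute the steps, each list could be passed to `subprocess.run(step, check=True)` so that a failing tool stops the chain.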

Automator

If you are working on an Apple machine, you can set up Automator to automatically convert all .png files as they are added to the repository.

  1. Open Automator.app, and select Folder Action, then click Choose.

    Automator menu
    Figure 19. Automator menu
  2. At the top of the screen, choose the folder where your the-examples-book repository lives. Drag and drop Run Shell Script from the left-hand menu.

    Automator setup
    Figure 20. Folder action setup
  3. Change Pass input to as arguments, and add the following script to the code block.

    for f in "$@"
    do
    	if [[ $f == *.png ]]
    	then
    		/usr/local/bin/magick convert "$f" -filter Gaussian -sharpen 0x3 -resize 50% "$f"
    		/usr/local/bin/pngquant 64 --skip-if-larger --strip --ext=.png --force "$f"
    		/usr/local/bin/zopflipng -y "$f" "$f"
    		/usr/local/bin/cwebp -q 30 "$f" -o "${f%.*}.webp"
    		rm "$f"
    	fi
    done
  4. Open Finder and navigate to each "images" folder that you’d like Automator to watch, so that .png files are converted and optimized automatically. To do this, Ctrl+click each directory and attach the workflow to it.

This is experimental. If you have a better way to do this, please feel free to contribute.


1. At least in my opinion.