How to contribute

Adding content

If you are interested in making a contribution to this book, the first step is to install and configure GitHub Desktop. Then, use the quick-start links provided below.

We made an example module to help with any new content. It will be under template-module in the repo and the example content page can be found here. Feel free to make a copy and use it for your own module!

Adding an example
Adding a page
Adding a module
Adding an appendix
Designing for accessibility

To learn more about how the book is organized, see our organization section. To learn about and see some AsciiDoc examples, see our AsciiDoc section.

Adding an example

This example assumes:

the-examples-book repository is located on your local machine at /Users/myusername/projects/the-examples-book.
You’ve installed and configured GitHub Desktop, and your Current Repository (in the upper left-hand corner) is set to the-examples-book. Your screen should look similar to the following.

Figure 1. GitHub Desktop screen

We want to add the following Python example to the Python module.

import pandas as pd

myDF = pd.read_csv("./grades_semi.csv", sep=";")
myDF.head()

First, create a new branch from the "main" branch describing what you are doing, for example, "add-example-reading-data". To do this, click on the middle tab in GitHub Desktop which will show your current branch. Switch branches to the "main" branch. Once complete, click on the middle tab again, and click the New Branch button.

Figure 2. Click the New Branch button
You will be presented with a field and description. Ensure that you are creating a branch from the "main" branch. Enter the new branch name in the text area.

Branch names must be unique, and not already exist.

Figure 3. Name and create the new branch
Next, look at the current structure of the book and determine which module the example belongs in. In this case, this is a Python example, and therefore most likely belongs in the Python module.
Next, create a new file with the code of the example, and place this file in the examples directory in the python directory. We can see that there is already a python/examples/example01.py file, so let’s call our example example02.py.
If we have already determined we are just adding an example (vs. a page, or module), it is likely that there is already a page where this example fits appropriately. In this case, this is an example of ready a semi-colon separated file using pandas. Looking in the Python module, we can see that the pandas-read-write-data.adoc page is where this example belongs. Open the document in your favorite text editor, and add the following content in the Examples section of the page.
```
==== How do I read a csv file called `grades_semi.csv` into a `pandas` DataFrame, where `grades_semi.csv` is semi-colon-separated instead of comma-separated?

.Solution
====
[source, python]
----
import pandas as pd

myDF = pd.read_csv("./grades_semi.csv", sep=";")
myDF.head()
----

----
   grade       year
0    100     junior
1     99  sophomore
2     75  sophomore
3     74  sophomore
4     44     senior
----
====
```
Upon saving the document, you will see that GitHub Desktop shows the added content in our pandas-read-write-data.adoc page. This addition is staged.

Figure 4. Staged change in GitHub Desktop
The next step is commit the changes. Check the files you’d like to commit in the left-hand section of GitHub Desktop. In the lower left-hand section of GitHub Desktop, create a descriptive commit message, for example, "Add an example reading in semi-colon-separated data in Python". In the larger text area, you can add any other important details.

All text entered in the description section is Markdown friendly — feel free to use bullets or headers.

Figure 5. Changes ready to commit
Once the changes are committed, and you are ready to incorporate these changes to the book, the next step is to publish (or push) this branch to our remote repository (hosted on GitHub).

Figure 6. Branch ready to publish
Click Publish branch. You should see a screen indicating the branch is being published.

Figure 7. Publishing branch
Once published, your branch will be available to everyone using GitHub. In order to incorporate the changes in this branch to the "main" branch, we must create a pull request or merge request. Click the blue Create Pull Request button.

Figure 8. Click Create Pull Request
This will launch you browser and present you with the following screen. On this screen, you can (and should) assign a reviewer (the individual responsible for reviewing the content), designate an assignee (the individual who will actually merge the content), assign a label, and add any further comments.

Figure 9. Open pull request

At this stage, you are done. All you need to do is respond to any follow up questions by the assignee or reviewer (which you will get emailed if/when this happens).

In time, your content will be reviewed, and merged. Once merged, it is a matter of waiting ~5 minutes until the book is compiled, deployed, and the search index is updated. Congratulations, and thank you!

Adding a page

This example assumes:

the-examples-book repository is located on your local machine at /Users/myusername/projects/the-examples-book.
You’ve installed and configured GitHub Desktop, and your Current Repository (in the upper left-hand corner) is set to the-examples-book. Your screen should look similar to the following.

Figure 10. GitHub Desktop screen

We notice there isn’t any content on pandas DataFrame’s, and we think that info would fit nicely into a separate page.

First, create a new branch from the "main" branch describing what you are doing, for example, "add-pd-dataframe-page". To do this, click on the middle tab in GitHub Desktop which will show your current branch. Switch branches to the "main" branch. Once complete, click on the middle tab again, and click the New Branch button.

Figure 11. Click the New Branch button
You will be presented with a field and description. Ensure that you are creating a branch from the "main" branch. Enter the new branch name in the text area.

Branch names must be unique, and not already exist.

Figure 12. Name and create the new branch
Next, look at the current structure of the book and determine which module the new page belongs in. In this case, this is a Python specific page, and therefore belongs in the Python module.
Next, create a new file in the Python module’s "pages" directory called pandas-dataframes.adoc. Add the following content to the new page.
```
= DataFrames

Some content.

== `some_function or method`

`some_function or method` description.

=== Examples

==== Question text

=== Resources
```
This structure is appropriate for most pages in the book. If you are creating a page that doesn’t fit this format, do your best and changes may be suggested in the review process.
Make any other additions or changes you’d like to make to the contents of the new page, and save.
Open up the module’s nav.adoc file and add navigation to the new page. In this example, the following would be an appropriate before and after for nav.adoc.
Before
```
* xref:index.adoc[Python]
** xref:pandas-intro.adoc[pandas]
*** xref:pandas-read-write-data.adoc[Reading & Writing Data]
```
After
```
* xref:index.adoc[Python]
** xref:pandas-intro.adoc[pandas]
*** xref:pandas-read-write-data.adoc[Reading & Writing Data]
*** xref:pandas-dataframes.adoc[DataFrames]
```
You can nest pages however you’d like — although it is best to keep it well organized. You can read more about navigation using Antora here.
Upon saving the document, you will see that GitHub Desktop shows the added content in our pandas-read-write-data.adoc page. This addition is staged.

Figure 13. Staged change in GitHub Desktop
The next step is commit the changes. Check the files you’d like to commit in the left-hand section of GitHub Desktop. In the lower left-hand section of GitHub Desktop, create a descriptive commit message, for example, "Add a page that covers pandas DataFrame’s". In the larger text area, you can add any other important details.

All text entered in the description section is Markdown friendly — feel free to use bullets or headers.

Figure 14. Changes ready to commit
Once the changes are committed, and you are ready to incorporate these changes to the book, the next step is to publish (or push) this branch to our remote repository (hosted on GitHub).

Figure 15. Branch ready to publish
Click Publish branch. You should see a screen indicating the branch is being published.

Figure 16. Publishing branch
Once published, your branch will be available to everyone using GitHub. In order to incorporate the changes in this branch to the "main" branch, we must create a pull request or merge request. Click the blue Create Pull Request button.

Figure 17. Click Create Pull Request
This will launch you browser and present you with the following screen. On this screen, you can (and should) assign a reviewer (the individual responsible for reviewing the content), designate an assignee (the individual who will actually merge the content), assign a label, and add any further comments.

Figure 18. Open pull request

At this stage, you are done. All you need to do is respond to any follow up questions by the assignee or reviewer (which you will get emailed if/when this happens).

In time, your content will be reviewed, and merged. Once merged, it is a matter of waiting ~5 minutes until the book is compiled, deployed, and the search index is updated. Congratulations, and thank you!

Adding a module

Adding an appendix

What we are referring to as an "appendix", Antora calls a "component". If a particular topic is largely outside of the scope of The Data Mine curriculum for most students, it may belong in an appendix instead. Adding an appendix to the book is straightforward.

Create a new repository (public) with the following, minimal structure. Alternatively, you can copy the following files/directories from an existing appendix repository and copy into your new one: .github, .gitignore, content (directory), readme.md.
```
my_repo2
└── content
    ├── antora.yml
    └── modules
        └── ROOT
            ├── nav.adoc
            └── pages

4 directories, 2 files
```

Fill in the content of the antora.yml configuration file. The following is an example of a minimal antora.yml file.

name: example-appendix
title: Example Appendix
version: master
display_version: stable
start_page: ROOT:index.adoc
nav:
- modules/ROOT/nav.adoc

Choose the name carefully. The name ends up appearing as the first value in the URL’s path and should have no spaces. For example: https://the-examples-book.com/example-appendix/<version>/<page-name>.

Create and place the desired AsciiDoc content in the pages directory. The index.adoc page will be the landing page for your appendix. For new high level pages, add lines to the nav.adoc file. If you do create further navigation in nav.adoc, you can then create new .adoc files with the corresponding name in the pages directory. Make sure to follow our page naming convention.
When you are satisfied with the results, modify the antora-playbook.yml file in our primary repository, and create a Pull Request. This is as simple as adding another item to our content sources section of the .yml file. For example:
```
- url: https://github.com/my_organization/my_appendix_repo
  branches: main
  start_paths: content
```
In the primary repository you will also need to change the index.adoc file add a link on the homepage to your appendix. For example:
```
* xref:example-appendix:ROOT:index.adoc[Example Appendix]
```
When you are done making updates make sure to push all changes in both your new repository and the primary Examples Book repository. All content incorporated into this book will be indexed and added to the search functionality by the maintainer.

Check out this book’s antora-playbook.yml file for more details.

Designing for accessibility

Accessibility in website and content design is a vital aspect that is often overlooked. When adding new content to The Examples Book it is recommended that you follow as many of the accessibility guidelines below as possible. In addition, the Data Mine staff is continuing to learn about how we can make the examples book more accessible. If you have any suggestions, please email us at [email protected].

It’s also important for us to note that we are not experts in accessibility. Our tips are based on the sources below, but the list is not exhaustive by any means. In our limited experience we recommend thinking through the different ways that people may access the site and attempting to design for as many of those ways as possible.

High-level Checklist for Web and Content Design

Is the site consistent in design (headings, lists, hierarchies)?
Is special text, such as code or quotes, easy to identify and understand?
Does the text avoid abbreviations and clearly explain topic specific terms or phrases.
Do videos have captions, dictations, or other alternative communication methods?
Do images have alternative text?
If the images contain important information are there other methods for gaining the information?
Are visualizations designed for easy understanding and clear readability?
Are the visualizations designed for color blind individuals?

Tips for Checking Your Design

Test with text-to-voice tools, such as VoiceOver for Mac.
Attempt to navigate the site without sound on the videos or other media.
Test visualizations with software tools that simulate colorblindness.

Organization

This book is organized into the core book, and supplementary appendices.

The core book contains content we hope every student learns. For example, an code snippet showing how to read and write data to a file using Python, would most likely get added somewhere in the core book.

Supplementary appendices are reserved for topics that are largely beyond the scope of The Data Mine curriculum for most students. In certain cases, a corporate partner’s team may be required to quickly learn about a topic. For example, perhaps a team is working on an app to translate natural language to an SQL query. In this scenario, a mentor may ask students to take a look at content in the NLP appendix. Before creating a Pull Request to add content, please take a moment to figure out where that content most appropriately fits. When in doubt, please feel free to ask!

Within the core book or an appendix, you will find a certain file structure.

Example book or appendix structure

my_repo
└── content (1)
    ├── antora.yml (2)
    └── modules (3)
        ├── ROOT (4)
        │   ├── attachments (5)
        │   │   └── example-project.zip
        │   ├── examples (6)
        │   │   ├── example01.py
        │   │   ├── example02.R
        │   │   └── example03.sh
        │   ├── images (7)
        │   │   ├── figure01.png
        │   │   └── figure02.png
        │   ├── nav.adoc (10)
        │   ├── pages (8)
        │   │   ├── getting-started.adoc
        │   │   └── storing-data.adoc
        │   └── partials (9)
        │       └── warning.adoc
        ├── module1 (11)
        │   ├── attachments
        │   ├── examples
        │   ├── images
        │   ├── nav.adoc
        │   ├── pages
        │   └── partials
        └── module2
            ├── attachments
            ├── examples
            ├── images
            ├── nav.adoc
            ├── pages
            └── partials

20 directories, 13 files

The file structure is important. Special keywords in the provided example that should not be modified are:

1	`content`
2	`antora.yml`
3	`modules`
4	`ROOT`
5	`attachments`
6	`examples`
7	`images`
8	`pages`
9	`partials`
10	`nav.adoc`

While not specifically highlighted in the provided example, the attachments, examples, images, pages, and partials folders are examples of repeated keywords, or keywords that are reused in each module. While nav.adoc is not technically a special keyword, we would like to keep this consistent with all modules and appendices added to the book.

In addition, in the core book repository, you will find a file called antora-playbook.yml in the root of the repository. This file is responsible for pulling in all of the resources for the core book and all of the dependencies.

This book uses Antora to render the online book. The following is a summary of information that can be found online. For a more in-depth look at Antora, please see the official documentation.

content

This is a directory at the root-level of the repository which should contain the antora.yml file and a modules directory.

antora.yml

This is a configuration file that contains information about the book or appendix contained within the sibling modules directory. At the time of writing, this is what the core book’s antora.yml file looked like.

name: book (1)
title: The Examples Book (2)
version: 0.1.0 (3)
start_page: ROOT:index.adoc (4)
nav:
- modules/ROOT/nav.adoc (5)
- modules/scholar/nav.adoc
- modules/unix/nav.adoc
- modules/SQL/nav.adoc
- modules/r/nav.adoc
- modules/python/nav.adoc
- modules/FAQs/nav.adoc
- modules/projects/nav.adoc
- modules/corporate_partners/nav.adoc
- modules/geospatial/nav.adoc
- modules/contributors/nav.adoc

1	This is the name of the core book or appendix. This appears as the first value in our URL’s path. For example: https://the-examples-book.com/book/<version>/<page-name>.
2	The title of the core book or appendix. This typically appears to the right of the favicon in the browser tab.
3	The version of the core book or appendix. This appears as the second value in our URL’s path. For example: https://the-examples-book.com/book/0.1.0/<page-name>.
4	The location of the "home" page for the core book or appendix. Where do we want Antora to start? Note: If we want Antora to start at `ROOT:index.adoc` then we can remove this line altogether, because this is the default.
5	A list of `nav.adoc` files. These files contain a list of anchor links to that appear in the left-hand menu. For example, at the time of writing this, `modules/ROOT/nav.adoc` looked like this. `* xref:index.adoc[Introduction] * xref:how-to-contribute.adoc[How to contribute]`

modules

The directory containing (at a minimum) the ROOT module, and any other custom-named modules you would like. In our example we also had a module1 module and a module2 module.

ROOT

The ROOT module. This is the default, required module. No further modules are required. An appendix or core book may not need more than the ROOT module. In fact, our scholar appendix only contains the ROOT module.

attachments

A "family directory" that should be used to store binaries or other large files. These attachments can then be referenced internally. See here.

examples

A "family directory" that should be used to store examples. For this book, examples will most likely be code snippets. In practice, any potentially reuseable code snippet should be added to the examples directory. This enables maintainers to update a code snippet once, and have it applied everywhere it appears in the book. These examples can be referenced internally, and included in a code chunk. See here.

images

A "family directory" that should be used to store all figures and images. For this book, we will require all images to be labeled figureXXX where XXX is a number starting at 001 and ending at 999. These images can be referenced internally. See here.

pages

A "family directory" that should be used to store each page of the book. These are AsciiDoc files. This is where the actual content lives.

In this book, each .adoc file should be named the same as the page title, in all lowercase, with spaces replaced by hyphens. For example, a page with the title "Introduction to Python" should be named introduction-to-python.adoc.

partials

A "family directory" that should be used to store content snippets. Provided examples include common descriptions, terminology, or referenced tables. Any content that doesn’t fit into a previously named "family directory", but will most likely be reused in multiple pages should be added as a partial to the partials directory. Partials can be referenced internally. See here.

nav.adoc

This folder contains a list of anchor links that end up as your navigation links in the menu on the left hand side of the book. For example, at the time of writing this, modules/ROOT/nav.adoc looked like this.

* xref:index.adoc[Introduction]
* xref:how-to-contribute.adoc[How to contribute]

AsciiDoc

This book is written using AsciiDoc. AsciiDoc is an open and powerful format for writing notes, text documents, books, etc. It is easy to write technical documentation in AsciiDoc, and quickly convert the text to various mediums like websites, ebooks, pdfs, etc.

Below are a variety of AsciiDoc examples from powerman. To see how content is rendered, compare the adoc file for this page, to it’s output below.

Headers

Level 1

Text.

Level 2

Text.

Level 3

Text.

Level 4

Text.

Level 5

Text.

Paragraphs

Optional Title

Usual paragraph.

Optional Title

Literal paragraph.
	Must be indented

Optional Title

list = ['first', 'second', 'third',]

Not code in the next paragraph.

Optional Title

This is an example of a single-paragraph note.

This is a tip.

This is important.

This is a caution note.

This is a warning.

Blocks

Optional Title

*Listing* Block

Use for code of file listings

Optional Title

import pandas as pd

myDF = pd.read_csv("myFile.csv")

Alternatively, you can also insert the content of source code files in a modules examples directory:

File in same module

##  include::../examples/example01.py[]

Optional Title

Sidebar Block

Use for sidebar notes.

Optional Title

Example Block

Use for examples

The default caption "Example:" can be changed using [caption="Custom: "] before the example block.

Optional Title

Note Block

Use for multi-paragraph notes.

*Passthrough* Block

Use for backend specific markup like:

Optional Title

*Literal* Block

Use when literal paragraph (indented) like
	1. First
	2. Second

is incorrectly processed as list.

Optional Title

Quote Block

Use to cite somebody.

— cite author
cite source

Text

forced
line break

bold text

italic text

monospaced text

highlighted text

‘double quoted’

“single quoted”

text^super

text_sub

A command: echo "some command"

monospaced and *bold*

passthrough bold

Escape characters: _italic_, _italic_, t__e__st, t__e__st, bold, <b>normal</b>, ¶, \`not single quoted', \``not double quoted''

Links

Text for the custom anchor link custom-anchor-link-01.

Text for the custom anchor link custom-anchor-link-02.

[custom-anchor-link-01]

My Anchor Link 1

[custom-anchor-link-02]

My Anchor Link 2

Link to Google

Footnotes

This is true.^[1]

Really, it is.^[1]

Images

We want to keep this site fast. The following are best practices for adding images to this book.

Reduce the size of images using the instructions here.
Specify both the height and width of each image.
Lazy load images by adding loading="lazy".
Use block images when possible.
Put all images in the associated modules "images" folder, and label them figure01.webp, figure02.webp, etc.
Always add a caption by adding title="Description of image."

Inline images

To include an image inline from same module:

To include an image inline from different module: Alternate text for CRS

Block images

To include a block image from the same module:

To include a block image from a different module:

Lists

Unordered list
Another item
Another item

Some important note really quick.
Back to the list
1. Ordered list
2. Another item
3. Another item
  
  Some important note really quick.
4. Back to the list
  First
  
  First item in description list.
  
  Second
  
  Second item in description list.
  
  Checked item 1
  
  Checked item 2
  
  Unchecked item

UI Macros

Button syntax

Press Submit when you are ready to submit and Cancel, if you want to cancel.

Keybinding syntax

To exit vim press Ctrl+C, type :wq, and press Enter.

To exit, click on File Close.

To export as PDF, click on File Export as… PDF.

Tables

LaTeX

We use MathJax to render LaTeX. Our current configurations allow inline equations using \$'s and display equations using \$\$'s. To escape a regular \$, you can prepend a \, so, for example, to write \$2.00 you would type \$2.00.

This is an example of an inline equation: $e=mc^2$

We’ve configured MathJax to not process text in the following tags: <script>, <noscript>, <style>, <textarea>, <pre>, <annotation>, <annotation-xml>, and <code>.

Optimize images

We want to keep this book as fast to load as possible. The following are our suggestions to optimize images prior to adding them to this book.

Resize the image to the width and height that it will be displayed online

magick convert input.png -filter Gaussian -sharpen 0x3 -resize 700x500 output.png

Reduce the number of colors used to 64

pngquant 64 --skip-if-larger --strip --ext=.png --force output.png

Use improved compression algorithm

zopflipng -y output.png output.png

Compress to webp

cwebp -q 30 output.png -o output.webp

You can read more about some of these optimizations in this excellent GitLab blog post.

Automator

If you are working on an Apple machine, you can setup Automator to automatically convert all `.png’s as they are added to the repository.

Open Automator.app, and select Folder Action, then click Choose.

Figure 19. Automator menu
At the top of the screen choose folder where your the-examples-book repository lives. Drag and drop Run Shell Script from the left-hand menu.

Figure 20. Folder action setup

Change Pass input to as arguments, and add the following script to the code block.

for f in "$@"
do
	if [[ $f == *.png ]]
	then
		/usr/local/bin/magick convert "$f" -filter Gaussian -sharpen 0x3 -resize 50% "$f"
		/usr/local/bin/pngquant 64 --skip-if-larger --strip --ext=.png --force "$f"
		/usr/local/bin/zopflipng -y "$f" "$f"
		/usr/local/bin/cwebp -q 30 "$f" -o "${f%.*}.webp"
		rm "$f"
	fi
done

Open finder and navigate to each "images" folders which you’d like automator to watch, and automatically convert and optimize .png files. This can be easily done by holding Ctrl+Click on the directory and attaching the workflow to each folder.

This is experimental. If you have a better way to do this, please feel free to contribute.

1. At least in my opinion.

How to contribute

Adding content

Adding an example

Adding a page

Adding a module

Adding an appendix

Designing for accessibility

Recommended Accessibility Links:

High-level Checklist for Web and Content Design

Tips for Checking Your Design

Organization

AsciiDoc

Headers

Level 1

Level 2

Level 3

Level 4

Level 5

Paragraphs

Blocks

Text

Links

Footnotes

Images

Inline images

Block images

Lists

UI Macros

Button syntax

Keybinding syntax

Menu syntax

Tables

LaTeX

Optimize images

Automator