Introduction

This post is long: feel free to skip parts; the subsections should be fairly standalone.

Two months after my first visit to CERN, I was invited back for a second software carpentry workshop. The audience was very different from last time: most attendees were CERN summer students with little to no programming experience. These students stay at CERN for around two months and attend a variety of top-level physics talks every morning.

The workshop was supported by the Citizen Cyberscience Centre and was initiated by François Grey in the context of the CERN Webfest. I had the chance to co-instruct with Kevin Dungs, whom I had met at the previous workshop. Overall, this workshop was once more an amazing and surprising experience.

Because I love it, let me once more plug the Large Hadron Rap.

Organization and Planning

This workshop spanned 4 consecutive afternoons during which we covered the bash/python/git trio. In the context of the Webfest, we decided to dedicate one of the four afternoons to client-side web technologies.

The workshop filled up very fast and we had 28 people registered (for free). A notable number of them did not show up on the first day, which is a shame because others would have attended in their place. We peaked at 22 attendees, with lower numbers depending on the day… and maybe on the room?

We changed rooms every day within CERN, which is quite big, so maybe we lost some people at the last minute. To give a sense of scale, the two farthest rooms were 1.6 km from the Hostel (luckily our host kindly helped us get back and forth between the places). In the end, some of the rooms would have had difficulty seating 30 people comfortably, so the number of attendees was well suited.

We used an etherpad for sharing links and comments, but once again we could not get the attendees to actually contribute to it. I'm still wondering if we could do better on this aspect or if we should just accept this situation and consider it “normal”.

Given the summer context (I had fewer hard constraints from work), this 4-day format was very manageable and way more pleasant than a packed 2-day workshop. This time, there were also some rooms available in the CERN Hostel. This way, I could hang around at CERN and I spent most of my mornings in the main cafeteria, which is a lovely place.

General Aspects and Core Lessons Coverage?

Most installations went quite well. Some students reported that the installation checker had been complaining about “EasyMercurial” and “Mayavi”. I'm wondering how many people it affects and whether they spend a lot of time trying to fix it. I should try the Windows installer to get a better understanding of this.

Overall, we didn't cover much of the advanced subjects. The sticky notes feedback included the usual “too slow” and “too fast”, especially for the bash and shell. From feedback and observations, I guess both of us type relatively fast and we sometimes did not wait long enough; it was also mentioned that having the online lesson to follow actually helped. We should also be more careful not to overwrite the cells in the notebook. The difficulty of finding the rooms was also mentioned on the post-its. Apart from that, the feedback was very positive and motivating.

The shell lesson went pretty well and we covered pipes, redirections, wildcards, loops and scripts, but we did not have time to go over find and grep. In python we got as far as functions and, along the way, also covered a little scripting for reproducibility outside the notebook (without sys.argv).

For the git lesson, seeing that some people did not speak latex, we stuck to the normal lesson, using plain text files. We covered up to .gitignore in the first slot. I found the repository-in-a-repository challenge somewhat confusing at the time. In the second part, we went up to creating and solving conflicts using repositories shared via github. We had a nice setup with two projectors to show our two screens and used the post-it method to associate a color to each role, putting a post-it on ourselves and on our screens.

The workshop was the occasion to (re)discover the amazing excludesfile option for computer-wide gitignore (see this short post (3rd way)) which is very useful for emacs and MacOS users.
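As a rough illustration of the excludesfile option, here is a minimal sketch. The file name ~/.gitignore_global and the ignore patterns are my own assumptions, not taken from the linked post, and the demo points HOME at a scratch directory so that it does not touch a real configuration.

```shell
# Scratch HOME so the demo does not touch your real ~/.gitconfig;
# drop these two lines to apply the setting for real.
export HOME="$(mktemp -d)"
cd "$HOME"

# ~/.gitignore_global is an arbitrary (hypothetical) file name.
cat > "$HOME/.gitignore_global" <<'EOF'
# emacs backup and auto-save files
*~
\#*#
# macOS Finder metadata
.DS_Store
EOF

# Tell git to apply this ignore file to every repository on the machine.
git config --global core.excludesfile "$HOME/.gitignore_global"

# Quick check in a throwaway repository.
git init -q demo-repo
cd demo-repo
touch paper.txt paper.txt~ .DS_Store
git status --short   # only paper.txt should be listed as untracked
```

The point of the computer-wide file is that editor and OS clutter stays out of every repository without each project's .gitignore having to know which editor you use.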

The “git checkout” Mess-Understanding

In the previous workshop, I taught the first git session and, given that the audience knew latex, I improvised a lesson using an article written in latex. I only followed the lesson script loosely but the lesson was extremely smooth.

This time, watching Kevin give the git lesson I realized how bad git checkout is to teach. I also noticed that I'm totally skipping git checkout as a means of recovering individual files when I teach git myself (apart from when I talk about branches).

I feel that explaining git checkout (to recover files from previous versions) is very tricky. In addition, as happened to us, there is a risk of ending up in the “detached HEAD” state (you can recover from it with git checkout master), which is almost impossible to explain before talking about branches.
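For readers who have not hit it yet, the detached HEAD situation can be reproduced in a throwaway repository. This is only a sketch: the file name, the commit messages and the demo identity are made up, and the branch is explicitly named master to match the recovery command above.

```shell
# Throwaway repository with two commits (all names here are hypothetical).
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the branch as in the lesson
git config user.name "Demo"
git config user.email "demo@example.com"
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "First draft"
echo "second draft" > notes.txt
git commit -q -am "Second draft"

# Checking out a commit without giving a file name moves HEAD itself:
# this is the detached HEAD state.
git checkout -q HEAD~1
git symbolic-ref -q HEAD || echo "HEAD is detached"

# Going back to the branch leaves the detached state.
git checkout -q master
git symbolic-ref --short HEAD   # prints the current branch name
cat notes.txt                   # the latest version is back
```

The slip is easy to make in a lesson: typing git checkout with a commit but forgetting the file name is enough.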

I also feel that git checkout could be removed from the lesson or replaced. Even though recovering a file is a motivating use case, I would remove it from the lesson entirely. If we want to keep the feature, we could replace it with git show VERSION:file/name and some shell redirection. Let's open an issue to discuss it and see what people think.
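To make the git show idea concrete, here is a small sketch in a throwaway repository. The file names and commit messages are invented for the example, and HEAD~1 stands in for VERSION.

```shell
# Throwaway repository with a good and a broken version of one file.
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "good version" > notes.txt
git add notes.txt
git commit -q -m "Good version"
echo "broken version" > notes.txt
git commit -q -am "Broken version"

# Print the file as it was one commit ago; HEAD does not move.
git show HEAD~1:notes.txt

# Recover it with a plain redirection (to a new name here, so that
# nothing in the working directory is overwritten).
git show HEAD~1:notes.txt > notes-recovered.txt
```

Since git show only prints content, there is no way to end up in a detached HEAD state, and the redirection reuses something learners already saw in the shell lesson.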

A first shot at an HTML/CSS/Javascript/D3 lesson

Kevin and I were the first to try and teach the D3 lesson (repo by IsaKiko). This lesson aims at covering the basics of HTML, CSS and Javascript, and at using them to make a live plot with the D3 library. We based our teaching on a branch I created just before the workshop but we still improvised a lot. Kevin also compiled a list of additional links that we put in the etherpad (see the end of this page).

The first time slot (before the break) was dedicated to HTML and CSS, including images and an intro to SVG. To give a better grasp of what HTML/CSS is, Kevin covered additional elements and properties compared to the lesson. I think this was a very good idea and he introduced meaningful constructs. He also introduced placekitten which is a must-know :). We got feedback that HTML/CSS might be too much to cover in one time slot, but the overall feedback was very good.

On the editor side, the lesson rightly lists “familiarity with an editor” as a prerequisite. Following the beginner workshop, students were mostly using nano, which, by default, is not amazing for web languages. We might have to improve our Windows installer to include proper handling of HTML/CSS/JS in nano.

About sharing with github, we did not follow the lesson content. Attendees had already collaborated in the git session and knew how to use git and push to github. We had each of them fork a repository we created on github, then clone it: this was the occasion to introduce forking and pull requests. We also instructed them to create a new directory named after their username, so the merges would not conflict (it is not a real collaboration but it is a compromise). We all really liked the contribution graph showing the forks. One mistake I made was to reuse the 2015-07-27-data repository that I had created to share data with participants. They ended up in a somewhat confusing situation with two versions of the data repository (one cloned from mine at the beginning of the workshop and one cloned from their fork during the D3 lesson). Next time, I'll create a third repo.

In the second slot, I introduced javascript. As I'm rather slow when teaching, I had to skim a few elements. Seeing that the time was short, I skipped the part on feeding the cat: I quickly introduced <script>, Javascript, alert and console.log. Then I skipped the reference-by-value semantics of javascript and explained the JSON notation.

From there, I mostly followed the lesson on D3, but without rendering the axes and thus without having to create nested canvases and translate them. I explained why and how to manipulate DOM elements using D3 (in the end, I did not use getElementById). Then (or at the same time) I loaded the data file but I decided to use a function named dataProcessing, instead of an anonymous one, to better illustrate the control flow. I still used anonymous functions for setting attributes (.attr('cx', function(d) {return ...})). We could not reach the animation part but we explained that it is relatively easy to convert the code to an animated version.

In the data repository, we had provided a cat picture, a version of d3 (in case of a flaky network) and the nations.json data file. While it was a good idea, using a local file for nations.json actually caused some problems. Since I do dynamic loading in my html presentations, I know the problem quite well: browsers are sometimes very strict when loading local URLs (starting with file://). For the record, chrome will refuse to load a file (using javascript) from a local URL, while firefox will allow only files in the current directory and its sub-directories. These are actually safety features that prevent a local page from reading and leaking your private local data.

The workaround we used was to load the version hosted by github, asking D3 to load this address: https://raw.githubusercontent.com/twitwi/2015-07-27-data/master/d3/nations.json, which one can obtain by clicking on the “raw” button when exploring the file on github. Another easy solution would be to start a web server using Python.
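The Python web server option could look like the sketch below. It assumes Python 3 (where the module is http.server), uses an arbitrary port 8000, and serves a scratch directory with a stand-in nations.json rather than the real data file.

```shell
# Scratch directory with a stand-in data file (not the real nations.json).
cd "$(mktemp -d)"
echo '[{"name": "Demo"}]' > nations.json

# Serve the current directory over http, in the background for this demo.
python3 -m http.server 8000 --bind 127.0.0.1 >/dev/null 2>&1 &
server_pid=$!
sleep 2   # give the server a moment to start

# The page and its data are now reachable through http:// instead of file://.
fetched="$(python3 -c 'import urllib.request; print(urllib.request.urlopen("http://127.0.0.1:8000/nations.json").read().decode())')"
echo "$fetched"

# Stop the server once done.
kill "$server_pid"
```

In a workshop, one would simply run python3 -m http.server 8000 in the lesson directory and open http://localhost:8000/ in the browser, so that the page and its data are served over http and the file:// restrictions no longer apply.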

This first occurrence of the HTML/CSS/D3 lesson was quite a good experience. We improvised around the lesson plan to fit in the afternoon, and the students found the session interesting, giving very positive feedback.

The convenient -one-page.html (and my -as-slides.html)

This time, I had a tablet on the side to follow the lessons more closely while giving them. I used the incredibly useful “one-page” versions made by Raniere in his nightly builds of the lessons. I also used his script to generate a one-page version of the D3 lessons.

Taking advantage of my train travel time, I also made a derivative of the one-page version. The goal is to have a slideshow for a lesson, containing one slide per figure, challenge, etc. I use it to quickly go through the lesson and also to show some of the figures/challenges to the learners. It is still a little rough but you can preview the result by adding -as-slides to the name of the one-page version on my temporary swc site, e.g., http://dl.heeere.com/swc/git-novice/all-in-one-as-slides.html (press 'm' to see all slides).

What to do next (for attendees)

I received an email asking “how to upgrade the knowledge obtained during the workshop”. I wrote a long answer to this question; it would take too much space here and might lead to a dedicated blog post.

Concluding Item Lists

Notes/Ideas for later:

  • don't forget the notes from previous workshop
  • think about how to improve the etherpad experience
  • start the second git lesson from the same laptop as the first one (it already has the repository)
  • anticipate problems with github two-factor authentication -> use SSH (need some keys) but have students use https
  • personally, remember that I'll get confused by matplotlib keeping an inner state

Links:

Any feedback or remarks? Contact me at click-me ;-p @nospam.com.