The news feed of www.michelepasin.org
http://www.michelepasin.org/words/
Latest articles, blog posts and news
en-us
Fri, 27 Sep 2024 00:00:00 +0000

Unpacking OpenAlex topics classification
https://www.michelepasin.org/blog/2024/09/27/open-alex-topics/

In this post I have taken a closer look at the classification of scientific disciplines in [OpenAlex](https://openalex.org/), a recently developed database of scientific works. The topics classification has been generated entirely computationally, using a mix of citation clustering techniques and LLM-based labeling. The results, although not always precise, are definitely worth exploring further.

Last week I went to the [STI 2024 conference](https://sti2024.org/sti-conference/) in Berlin, the annual European get-together of experts in the area of research analytics and evaluation. There were lots of interesting talks, but probably the thing that struck me the most was the general excitement and sense of expectation about OpenAlex.

If you haven't encountered it yet, [OpenAlex](https://openalex.org/) is an open database of research publications and other related content (e.g. datasets, authors, journals), released in 2022 and developed by [OurResearch](https://ourresearch.org/). Pretty much all o ...

Fri, 27 Sep 2024 00:00:00 +0000

Dimensions: Calculating Disruption Indices at Scale
https://www.michelepasin.org/papers/2023/09/13/dimensions-calculating-disruption-indices-at-scale/

Evaluating the disruptive nature of academic ideas is a new area of research evaluation that moves beyond standard citation-based metrics by taking into account the broader citation context of publications or patents. The "CD index" and a number of related indicators have been proposed in order to characterise mathematically the disruptiveness of scientific publications or patents.
This research area has generated a lot of attention in recent years, yet there is no general consensus on the significance and reliability of disruption indices. More experimentation and evaluation would be desirable, but this is hampered by the fact that these indicators are expensive and time-consuming to calculate, especially at scale on large citation networks. We present a novel method to calculate disruption indices that leverages the Dimensions cloud-based research infrastructure and reduces the computational time taken to produce such indices by an order of magnitude, while also making this functionality available within an online environment that requires no set-up effort. We explain the novel algorithm and describe how its results align with preexisting implementations of disruption indicators. This method will enable researchers to develop, validate and improve mathematical disruption models more quickly and with more precision, thus contributing to the development of this new research area.

Fri, 06 Sep 2024 00:00:00 +0000

Designing great dashboards: a slidedeck
https://www.michelepasin.org/blog/2023/07/06/designing-great-dashboards/

What makes a dashboard great? Here is a slide deck ([gslides](https://docs.google.com/presentation/d/e/2PACX-1vTQKTlvOtfXOKpnhdYJJEExUKf0sIh9cwiqu8SmUmU2NlhPEVOFxArj6hs77CuB8rKdUXG8om0IxKd-/pub?start=false&loop=false&delayms=3000)) that consolidates several useful ideas I've run into in the past.

After reading many useful papers and online resources on the topic of dashboard design, I realised I didn’t have a single document collecting and organising all of the useful ideas I encountered.
So the purpose of this slide deck ([gslides](https://docs.google.com/presentation/d/e/2PACX-1vTQKTlvOtfXOKpnhdYJJEExUKf0sIh9cwiqu8SmUmU2NlhPEVOFxArj6hs77CuB8rKdUXG8om0IxKd-/pub?start=false&loop=false&delayms=3000)) is to serve as a (work-in-progress) handbook that a dashboard developer can come back to, in order to find inspiration, advice and maybe even endorsement.

<iframe src="https://docs.google.com/presentation/d/e/2PACX-1vTQKTlvOtfXOKpnhdYJJEExUKf0sIh9cwiqu8SmUmU2NlhPEVOFxArj6hs77CuB8rKdUX ...

Thu, 06 Jul 2023 00:00:00 +0000

Notes from the book: Deep Work (2016)
https://www.michelepasin.org/blog/2023/07/01/deep-work/

Finally got down to reading the book [Deep Work](https://www.worldcat.org/title/920740925) by Cal Newport (2016). The book's central idea is that 'deep work', i.e. work based on prolonged stretches of focused time without distractions, has become largely underrated in today's always-on internet world. And that is not good.

The book's argument didn't strike me as revolutionary, or particularly new. I think that anyone with some kind of advanced education (academic or not) knows exactly how important focused work is.

### Deep work is underrated

This book is a convincing reminder of the fact that in many jobs *deep work is not seen as essential* anymore. So we should make a conscious effort to make room for it in our lives, and to get others to recognise its importance.
*Yes - I am talking to you, business managers and time-suckers!*

I saved a few passages from the book that I felt ...

Sat, 01 Jul 2023 00:00:00 +0000

Any sufficiently advanced technology is indistinguishable from magic
https://www.michelepasin.org/blog/2023/06/05/chatgpt-as-music/

[Arthur C Clarke](https://en.wikipedia.org/wiki/Clarke%27s_three_laws) once commented that "Any sufficiently advanced technology is indistinguishable from magic".

Today's [LLMs](https://en.wikipedia.org/wiki/Wikipedia:Large_language_models) get described as a baby that gets magically fed the entire web’s worth of documents. The baby learns how words are associated together, can make sense of questions and can say words back to us with enormous dexterity.

But the baby hasn’t spent a single minute out in the real world.

It simply reproduces language **as if it were music**. Given an input melody, it spits out another melody that matches it, more or less, according to predefined parameters and, of course, the input patterns.

### With LLMs, there is no world, just the music

This is just an imitation game. It is designed to be like that. Musical patterns in, musical patterns out. That’s where it derives its strength from and that’s why it appears so magical. It’s pretty damn good at imi ...

Mon, 05 Jun 2023 00:00:00 +0000

SciGraph 2017-2023
https://www.michelepasin.org/blog/2023/02/03/rip-scigraph/

Springer Nature retired [SciGraph](https://www.springernature.com/gp/researchers/scigraph) earlier this month. I have been the data architect and then technical lead for this project, so this post is just a reminder of the great things we did in it. Also, a little rant about the things that weren't that great...

## Open Linked Data for the Scholarly domain

SciGraph has been running for almost 8 years.
I've been involved with the project since its early days in 2016, together with [lots of enthusiastic people at Springer Nature](https://www.youtube.com/watch?v=HzzBuHy51wI). It started out as an attempt to break data silos about scientific publications. We chose [Linked Data](https://en.wikipedia.org/wiki/Linked_data) as its core technology for multiple reasons: its open standards and vibrant community, the expressive knowledge modeling languages, and last but not least the intent to support an increasing number of researchers/data-scientists who could independently [take advanta ...

Fri, 24 Feb 2023 00:00:00 +0000

Paperpile: a PDF manager with Google Drive backend
https://www.michelepasin.org/blog/2023/01/19/introducing-paperpile/

[Paperpile](https://paperpile.com/) is an online PDF manager that stores your personal data in your Google Drive folder. I recently found out about it and discovered that it addresses the biggest issue I had with most of its competitors: the [vendor lock-in](https://en.wikipedia.org/wiki/Vendor_lock-in) problem.

![2023-01-20-paperpile-1.png](/media/static/blog_img/2023-01-20-paperpile-1.png)

## Organizing papers, hello old friend

I recently started working on a new topic, collecting and organising academic papers to build a conceptual map of the area. So I began looking for a piece of software that could help with that task.

This problem is not new to me. In the past I've used [Mendeley](https://www.michelepasin.org/blog/2012/08/07/using-mendeley-and-dropbox-to-sync-your-pdf-library-across-computers/index.html) a lot for this task, as well as its competitors [Readcube](https://app.readcube.com/) and [Papers](https://www.papersapp.com/).
Frustrated by the lack of portability ...

Thu, 19 Jan 2023 00:00:00 +0000

Ontospy version 2.0 released
https://www.michelepasin.org/blog/2022/10/30/Ontospy-v2-released/

Version 2 of the library includes [SHACL](https://www.w3.org/TR/shacl/) support as well as various internal refactoring.

[Ontospy](http://lambdamusic.github.io/Ontospy/) is an open source Python library and command line tool for working with vocabularies encoded in the RDF family of languages.

It took months to get through this release, so I'm really glad it's finally happened.

## What's new in 2.0

Main improvements are:

- Remove all Django dependencies, replaced with [Jinja2](https://jinja.palletsprojects.com/en/3.1.x/intro/#installation)
- Drop support for Python 2
- Refactor code / clean up
- Merge additional SHACL support branch [pull-107](https://github.com/lambdamusic/Ontospy/pull/107)
- Fix error loading JSONLD graphs [issue-1416](https://github.com/lambdamusic/Ontospy/issues/102)
- Rename internal `ontodocs` module to `gendocs`

## See also

The [official documentation](http://lambdamusic.github.io/Ontospy/)

![2022-10-30-ontospy-v2.png](/media/static/blog_i ...

Sun, 30 Oct 2022 00:00:00 +0000

Generating large-scale network analyses of scientific landscapes in seconds using Dimensions on Google BigQuery
https://www.michelepasin.org/papers/2022/09/01/generating-largescale-network-analyses-of-scientific-landscapes-in-seconds-using-dimensions-on-google-bigquery/

The growth of large, programmatically accessible bibliometrics databases presents new opportunities for complex analyses of publication metadata. In addition to providing a wealth of information about authors and institutions, databases such as those provided by Dimensions also provide conceptual information and links to entities such as grants, funders and patents.
However, data is not the only challenge in evaluating patterns in scholarly work: these large datasets can be challenging to integrate, particularly for those unfamiliar with the complex schemas necessary for accommodating such heterogeneous information, and those most comfortable with data mining may not be as experienced in data visualisation. Here, we present an open-source Python library that streamlines the process of accessing and diagramming subsets of the Dimensions on Google BigQuery database and demonstrate its use on the freely available Dimensions COVID-19 dataset. We are optimistic that this tool will expand access to this valuable information by streamlining what would otherwise be multiple complex technical tasks, enabling more researchers to examine patterns in research focus and collaboration over time.

Thu, 01 Sep 2022 00:00:00 +0000

Bringing quotations back to life
https://www.michelepasin.org/blog/2022/07/28/introducing-quotes-section/

There's a new section on this site that lets you navigate quotations: [quotes.michelepasin.org](https://quotes.michelepasin.org).

It's just a cut-down implementation of an [old idea](https://www.michelepasin.org/blog/2015/01/05/introducing-resquotes-com/index.html) I worked on a while ago but, you know, sometimes it is useful to start from scratch and re-think things from the ground up.

### Why?

These are quotes I've been collecting here and there, over the years, using various apps like [NVALT](https://brettterpstra.com/projects/nvalt/), [Notes](https://support.apple.com/en-gb/guide/notes/welcome/mac) or emails. The quotes have also been categorised a little using tags and titles.

Since I hate to have stuff lying around on my hard drive and hardly being used, I've made a new [webapp](https://quotes.michelepasin.org) that lets you browse all of this content.
Possibly, someone other than me can find it useful or inspiring.

### A bit of history

A while ago, I built a webapp called ...

Fri, 29 Jul 2022 00:00:00 +0000

A semi-automated conference assistant
https://www.michelepasin.org/blog/2022/06/30/a-semi-automated-conference-assistant/

A couple of weeks ago I went to the excellent [Move Or Perish—Scientific Trajectories, Inclusion, And Inequality, And Their Consequences For Transformative Science](https://www.csh.ac.at/event/csh-workshop-move-or-perish-scientific-trajectories-inclusion-and-inequality-and-their-consequences-for-transformative-science/) workshop in Vienna.

While getting ready for it, I found myself asking some familiar questions. Who are the speakers? What is their background? How to best contextualise the topics being discussed?

Nowadays scientists tend to specialise in highly niche areas, so it doesn't take much for people to feel they are getting out of their comfort zone when attending a conference. So many times I have wished I had an automated digital *conference assistant*.

### Brainstorming with the Dimensions API

These are big questions, I know, but I wonder if a simple piece of software could help. Put simply, a piece of software that would sift through the available online information about the spe ...

Thu, 30 Jun 2022 00:00:00 +0000

Exploring Bento noise box
https://www.michelepasin.org/blog/2022/05/29/bento-noise-box/

Improvised acid loops using [Extempore](https://extemporelang.github.io/) + [Bentō](https://www.giorgiosancristoforo.net/).

> Bentō is a standalone noise box with tape recorder, inspired by the japanoise scene. Thanks to its unstable and very unique oscillators, Bentō can create an enormous number of sounds and unpredictable noises that are not possible with traditional subtractive synthesizers.
See the [PDF user manual](https://www.giorgiosancristoforo.net/downloads/Bento_User_Manual.pdf)

![bento-screenshot.jpg](/media/static/blog_img/bento-screenshot.jpg)

## Take 1

Just trying to control it using MIDI-CC from Extempore.

Note: I previously created some MIDI mappings and saved them to a [file](https://github.com/lambdamusic/extempore-extensions/blob/main/init/init_bento.xtm) I can reload each time.

<iframe width="560" height="315" src="https://www.youtube.com/embed/P6Av_eLy_xw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encr ...

Sun, 29 May 2022 00:00:00 +0000

Three things I do *not* like about Looker
https://www.michelepasin.org/blog/2022/04/20/Three-things-i-do-not-like-about-looker/

Following up on my previous post, [3 things I like about Looker](/blog/2022/03/02/Three-things-i-like-about-looker/), here are the top three things that I really wish were different about this piece of software.

> [Looker](https://www.looker.com/) is a business intelligence software and big data analytics platform that helps you explore, analyze and share real-time business analytics easily. Looker is part of the Google Cloud platform.

## 1. Can't make public dashboards

I really wish I were able to create a dashboard and make it available on the web without the need for users to log in. Instead:

> To view the dashboard, anyone with the link must have access to the Looker instance on which the dashboard is saved, as well as access to the [dashboard](https://docs.looker.com/sharing-and-publishing/organizing-spaces#viewing_and_managing_access_for_a_folder) and [models](https://docs.looker.com/admin-options/settings/roles#model_sets) that the tiles are based on.
Dashboard shar ...

Wed, 20 Apr 2022 00:00:00 +0000

Composition: 'Study for Cello and Double-bass'
https://www.michelepasin.org/blog/2022/04/07/cellos-livecoding/

A new livecoding composition using [Extempore](https://extemporelang.github.io/) and Ableton Live: 'Study for Cello and Double-bass'.

<iframe width="560" height="315" src="https://www.youtube.com/embed/VR6lMsECEQc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

## Creating chords using a cosine function

The main technique used in this piece is to generate chord/harmonic variations using cosine functions.

```scheme
; every 8 beats, update the root chord
(at 8 0
  (set! *melody*
    (:mkchord (:mkint 48 (cosrfloor 7 7 1/30) 'M)
              'M
              (cosrfloor 7 3 1/5))))
```

Every 8 beats the root chord (used by all instruments in order to generate musical patterns) gets updated. Two cosine functions are used simultaneously to:

1. Determine the *amplitude* of the interval (major or minor, starting from C3) that generates the root note of the chord.
2. Determine the number of notes in the chord.

The ...

Thu, 07 Apr 2022 00:00:00 +0000

Three things I like about Looker
https://www.michelepasin.org/blog/2022/03/02/Three-things-i-like-about-looker/

Looker is a business intelligence and data visualization tool which was recently acquired by Google. After nearly 6 months of using Looker for building dashboards and visual analytics, here are the top 3 things I like about this platform.

> [Looker](https://www.looker.com/) is a business intelligence software and big data analytics platform that helps you explore, analyze and share real-time business analytics easily. Looker is part of the Google Cloud platform.

## 1. LookML

> [LookML](https://docs.looker.com/data-modeling/learning-lookml) is a language for describing dimensions, aggregates, calculations, and data relationships in a SQL database. Looker uses a model written in LookML to construct SQL queries against a particular database.

LookML provides a dedicated modeling layer for your dashboard applications. Think of LookML objects as building blocks, which can be extended and combined together in different ways without repeating code. Compared to simply writing SQL que ...

Wed, 02 Mar 2022 00:00:00 +0000

A static site generator using Django, Wget and Github Pages
https://www.michelepasin.org/blog/2021/10/29/django-wget-static-site/

If you're a Django developer and want to publish a website without the hassle (and costs) of deploying a web app, then this post may give you some useful tips.

I found myself in this situation several times, so I have created a time-saving workflow/set of tools for extracting a dynamic [Django](https://www.djangoproject.com/) website into a [static](https://en.wikipedia.org/wiki/Static_web_page) website (= a website that does not require a web application, just plain simple HTML pages).

> Disclaimer: this method is not suited for all types of websites. E.g. if your Django application is updated frequently (e.g. more than once a day), or if it has keyword search (or faceted search) pages that inherently rely on dynamic queries to the Django back-end based on user input, then a static site most likely won't cut it for you.

In a nutshell - this is how it works:

1. On my computer, I create / edit the website contents using [Markdown](https://en.wikipedia.org/wiki/Markdown) as much a ...

Fri, 05 Nov 2021 00:00:00 +0000

Terminal script: getting the time in different world time zones
https://www.michelepasin.org/blog/2021/10/12/world-date-terminal/

A little Bash script to show information about world time zones. Because I love my colleagues abroad, but I constantly struggle to remember how many hours ahead (or behind?) they are.

I am using the script on a Mac, but it should work on other systems too with little or no changes.

Basically, it scans the `zoneinfo` database ([more info](https://en.wikipedia.org/wiki/Tz_database)) that most likely already exists on your computer, in order to return the rows matching an input string.

For example - what's the time in Australia right now?

```bash
$ wdate australia
/Australia/Melbourne Tue 2021-10-12 20:47:48
/Australia/Queensland Tue 2021-10-12 19:47:48
/Australia/North Tue 2021-10-12 19:17:48
/Australia/Lord_Howe Tue 2021-10-12 20:47:48
/Australia/Adelaide Tue 2021-10-12 20:17:48
/Australia/Yancowinna Tue 2021-10-12 20:17:48
/Australia/Victoria Tue 2021-10-12 20:47:48
/Australia/Canb ...
```

Tue, 07 Sep 2021 00:00:00 +0000

Recipe: Making a livecoding screencast with QuickTime and RecordIt
https://www.michelepasin.org/blog/2021/08/30/recordit-plugin/

This post shows how to make a livecoding screencast using free OSX technologies.

Capturing system audio and screen-recording your live coding performance can be done in multiple ways. Here's a method based on Apple's [QuickTime](https://en.wikipedia.org/wiki/QuickTime) and an audio plugin that is part of a third-party application, [Record It](https://www.buildtoconnect.com/en/products/recordit).

> Note: both of these software components are free.
The [Record It Audio Device](https://www.buildtoconnect.com/downloads/RecordItAudioDevice.pkg) is a free extension that enables you to capture system sounds on your Mac. It acts as a virtual audio input device and sends the sound from music, videos, and system alerts that you would normally hear through your speakers to the input cha

## Recording a screencast: steps

1. Get the [Record It audio plugin](https://www.buildtoconnect.com/help/how-to-record-system-audio) PS this is a free audio extension, even if it is part of a paid-for ...

Mon, 30 Aug 2021 00:00:00 +0000

Composition: 'Rhythmic Cycles' with Extempore
https://www.michelepasin.org/blog/2021/04/10/livecoding-rhythmic-cycles/

A new livecoding composition using [Extempore](https://extemporelang.github.io/) and Ableton Live: 'Rhythmic Cycles'.

<iframe width="560" height="315" src="https://www.youtube.com/embed/m3v8gRzROkU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

## Using 'map' to generate musical textures

The gist of this experiment relies on the `map` function. Using `map` and lists of notes and offsets, it is possible to schedule repeated calls to the `play` note function:

```scheme
(map (lambda (x y z)
       (onbeat x 0 (play y z (* dur .9) 1)))
     times notes volumes)
```

When the `map` pattern above gets repeated via a loop, changing the input parameters generates a texture of sounds with a touch of randomness.
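As a rough illustration of the same idea outside Extempore, here is a minimal Python sketch of mapping over three parallel lists. It is not the Extempore API: `schedule_notes`, the tuple layout and the sample values are hypothetical, and nothing is actually played. The function only returns the events a scheduler would receive, with the `0.9` factor mirroring `(* dur .9)` in the snippet above.

```python
def schedule_notes(times, notes, volumes, dur=1.0):
    """Pair each beat offset with a pitch and a volume, like a three-list map.

    Returns (offset, pitch, volume, length) tuples; a real system would hand
    these to an audio or MIDI scheduler instead of returning them.
    """
    events = []
    # zip walks the three lists in parallel and stops at the shortest one.
    for offset, pitch, volume in zip(times, notes, volumes):
        # Each note lasts slightly less than dur, leaving a small gap.
        events.append((offset, pitch, volume, dur * 0.9))
    return events


# Hypothetical input: a C major triad spread over consecutive half beats.
events = schedule_notes([0, 0.5, 1], [60, 64, 67], [80, 70, 90], dur=0.5)
for event in events:
    print(event)
```

Calling the function repeatedly with slightly varied inputs is what would produce a changing texture.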
For example, some parameters to experiment with:

- times can be shifted up or down by 1/4 beat or so
- notes can be transposed using different chord struc ...

Sat, 10 Apr 2021 00:00:00 +0000

Livecoding visual patterns with Hydra
https://www.michelepasin.org/blog/2021/04/02/Hydra/

Had quite a lot of fun livecoding visual patterns with [Hydra](https://hydra.ojack.xyz/?sketch_id=eerie_ear_0)

> Hydra is live code-able video synth and coding environment that runs directly in the browser. It is free and open-source and made for beginners and experts alike.

![hydra-shot.png](/media/static/blog_img/hydra-shot.png)

## Reusable snippets

One of the coolest aspects of Hydra is that it is self-contained and browser-based. A program can be easily shared, either as code or via a URL.

E.g. try [this link](https://hydra.ojack.xyz/?sketch_id=uX3CNPimomz79Tib) - which renders the following snippet:

```
// ee_5 . FUGITIVE GEOMETRY VHS . audioreactive shapes and gradients
// e_e // @eerie_ear
//
s = () => shape(7.284).scrollX([-0.5, -0.2, 0.3, -0.1, -0.062].smooth(0.139).fast(0.049)).scrollY([0.25, -0.2, 0.3, -0.095, 0.2].smooth(0.453).fast(0.15));
//
solid().add(gradient(3, 0.05).rotate(0.05, -0.2).posterize(0.56).contrast(0.016), [1, 0.541, 1, 0.5, 0.181, 0.6].smooth(0. ...
```

Fri, 02 Apr 2021 00:00:00 +0000