sachachua: 2019-09-16 Emacs news

Update: Added Paris meetup on Sept 17. Also, almost forgot: opensource.com is planning to feature Emacs articles in October. If you would like to contribute, please contact Seth Kenlon (skenlon@opensource.com).

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Hacker News, planet.emacslife.com, YouTube, the Emacs NEWS file and emacs-devel.

-1:-- 2019-09-16 Emacs news (Post Sacha Chua)--L0--C0--September 16, 2019 11:16 AM

Irreal: Exporting Only the Contents of An Org Subtree

Here’s a short Org-mode tip that, although quite old, I just learned about. Suppose you want to export an Org subtree but you want only the contents and not the header. This emacs-reddit question, which alerted me to the solution, gives a typical use case for such a capability.

It turns out that there are lots of ways of doing this but the easiest is to require ox-extra as explained here. Once you’ve gotten the proper packages loaded and configured, you need only add an :ignore tag to the header that you don’t want to export. This differs from using the (typical) :noexport tag in that the contents will still be exported.

Notice that the package you need to install is org-plus-contrib not ox-extra. Also note that you’ll need to have the Org repository in your list of ELPA repositories. You can do that with something like

(add-to-list 'package-archives '("org" . "https://orgmode.org/elpa/") t)

if you don’t already have it included.

I haven’t needed something like this very often but it’s nice to know how to do it when I do have the need.

-1:-- Exporting Only the Contents of An Org Subtree (Post jcs)--L0--C0--September 15, 2019 03:55 PM

Irreal: Dumb-jump 0.5.3

Back in 2017, I started using dumb-jump after seeing Mike Zamansky’s video demonstrating it. As I said at the time, it really hit the sweet spot for me. I have tried many times over my career to warm up to TAGS-based systems and I never could. They were always too fussy and required maintaining the TAGS file.

The wonderful thing about dumb-jump is that it just works and requires little or no configuration and no index files to keep up to date. It does its magic by leveraging one of the grep siblings to search for what you need. That sounds as if it could be slow and it might be for very large projects but by using a fast grep (ripgrep) I found it to be instantaneous while browsing the Unix v10 kernel sources.

Jack Angers has just released Version 0.5.3 of dumb-jump and added support for 10 more languages. That means that you can use dumb-jump with more than 40 languages. Unless you’re using something fairly obscure, dumb-jump will probably support it.

I was a little skeptical at first due to my bad experiences with trying to use TAGS but I’ve been delighted with dumb-jump and it’s now one of my favorite packages. If you’re an Emacs user you should give it a try. It doesn’t take any effort at all to install it and you can easily delete it if it doesn’t meet your needs.

-1:-- Dumb-jump 0.5.3 (Post jcs)--L0--C0--September 13, 2019 04:34 PM

Grant Rettke: Try Out Org2Blog v1.1.0 Using This Package

Right now you can’t try out an Org2Blog v1.1.0 package from MELPA because it isn’t yet building using Org2Blog’s new recipe. Eventually the pull request will get merged. Once it does I will push the changes. However, you can still try out v1.1.0 using a package. If you’ve been waiting to try Org2Blog v1.1.0 using … Continue reading "Try Out Org2Blog v1.1.0 Using This Package"
-1:-- Try Out Org2Blog v1.1.0 Using This Package (Post grant)--L0--C0--September 11, 2019 04:47 PM

sachachua: 2019-09-09 Emacs news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Hacker News, planet.emacslife.com, YouTube, the Emacs NEWS file and emacs-devel.

-1:-- 2019-09-09 Emacs news (Post Sacha Chua)--L0--C0--September 10, 2019 06:47 PM

Raimon Grau: Migrating from vim (proficent) to emacs

In HN's thread about 26.3 being released (Congrats!), there's this guy  explaining that being already quite confortable with vim, it's too much of a commitment to move to emacs.

I remember the frustration when coming to emacs from an advanced vimmer POV: no "yyp"? no ":%s/foo/bar"? imap? But here's what I answered him:

As this is not an overnight conversion and you are quite proficent with vim already, my advice is to:
- Get used to type "emacs file.txt" instead of "vim file.txt" in your console.
- Have a function in emacs that opens the current file in vim for those moments where you just want your trusted environment. Writing it by yourself is a good focused learning experience.
This way you'll decide (and balance) how much you want to learn every day, and little by little you'll find yourself using that function less and less.

And I think this goes very well with my other "bootstrap your emacs lisp learning".

It's the same approach to most of my development (and life) efforts.  It's a function of how bad do you need it, how fast do you want it, the compound interest of starting early, and how much it slows you down (or blocks you for doing other things).
-1:-- Migrating from vim (proficent) to emacs (Post Raimon Grau (noreply@blogger.com))--L0--C0--September 03, 2019 12:06 AM

Chen Bin (redguardtoo): Aspell 0.60.8 will have direct support for camelCase words

Kevin Atkinso told me this good news.

Currently I'm using Emacs Lisp to check camel case words.

The new option --camel-case from aspell will definitely speed up the whole process.

Minimum Emacs setup,

(setq ispell-program-name "aspell")
(setq-default ispell-extra-args '("--sug-mode=ultra"
                                  "--lang=en_US"))
;; Make sure new aspell is installed
(when (string-match-p "--camel-case"
                      (shell-command-to-string (concat ispell-program-name " --help")))
  (push "--camel-case" ispell-extra-args))

Optionally, you could read What's the best spell check setup in emacs.

-1:-- Aspell 0.60.8 will have direct support for camelCase words (Post Chen Bin)--L0--C0--September 01, 2019 01:13 PM

Marcin Borkowski: A simple tip with overlays and diffs

A few days ago I had an interesting problem. I had to resolve a particlarly nasty Git merge. It involved several files with lines’ lengths in the triple digits and a fair number of very small changes. Seeing those changes (in smerge-mode), even after refining the diffs, was tricky – there were many very small patches (sometimes two, sometimes four characters) of changed text and I was quite afraid that I would miss some of them. I searched for a command to “go to the next patch of changes”, but to no avail. Then I decided to write my own.
-1:-- A simple tip with overlays and diffs (Post)--L0--C0--August 31, 2019 11:17 AM

Grant Rettke: Are Load Hooks Always A Bad Idea?

No, They usually are. My own experience has only been to use them when I don’t understand the load order and want to “just make it work”. Very pragmatic and in the long run it has to change. According to the manual: Normally, well-designed Lisp programs should not use with-eval-after-load. Its great to be normal.
-1:-- Are Load Hooks Always A Bad Idea? (Post grant)--L0--C0--August 27, 2019 01:02 AM

Manuel Uberti: A pen and a notebook

Cal Newport’s Digital Minimalism has really changed my perspective on a lot of things in my life. Since I started my decluttering process I’ve noticed several benefits. In particular, I’m less prone to distractions and leisure has become time for engaging activities.

This new approach has prompted me to reconsider how I organise my days, weeks, and months. For a couple of years I’ve been using my custom variation of GTD, tracking everything with Org files in Emacs, and using Syncthing and Orgzly to keep it all synced with my phone. However, as I already explained when I wrote about Cal Newport’s book, my smartphone is not my best friend, and so the whole setup is actually of little use. And let’s be frank: sometimes I can leave Emacs alone.

I also write a lot, everyday. I write about films, note my ideas on the music I listen to and the books I read, and I write for this blog too. Hence I need a place which can be useful both for organising my life and for my basic urge to write. Enter Bullet Journal.

I recently rediscovered the pleasure of pen and paper when in Leuven for Heart of Clojure. Since I left the laptop at home on purpose, I spent almost every spare moment on my notebook, and much of the article on the conference came straight from its pages. As Ryder Carroll, the man behind Bullet Journal, points out in his book, the connection our brain establishes with pen and paper is completely different from the one with keyboard and monitor. For me it’s mostly been a feeling I sensed when writing on my notebook. My thoughts just came out with a spontaneity I’ve rarely enjoyed with the digital counterparts.

I won’t bother you with the details about Bullet Journal. The website has tutorials, videos, there is a book about it, so everything you need to know is out there. I picked this system because I find it simple and clear, modular and practical. Starting next month, I am going to fully migrate my GTD setup to Bullet Journal and once again step back to distance myself from pervasive technology consumption.

-1:-- A pen and a notebook (Post)--L0--C0--August 25, 2019 12:00 AM

Rubén Berenguel: SyncTeX and pdf-view-mode for emacs

Or, destiny is cruel
Back in the days of yore, when I was switching between my Windows machine and a Linux machine, I remember having SyncTeX active in my Windows machine. It was a wonderful experience: SyncTeX lets you click anywhere on a generated file from LaTeX and gets back to your editor, to the place generating where you clicked. This was extremely useful, specially later on when you need to adjust many formulas to fit and you need a bit of back-and-forth-ing.

Then I got a Mac, and since Preview is so handy I slowly forgot about SyncTeX. Time went on, and I merrily kept on editing LaTeX files as usual. I even managed to deliver my PhD dissertation a couple weeks ago, the formal speech will be in a month or two (come at your own risk). AucTeX’s preview saved most of the days, so I slowly even forgot SyncTeX existed. Shame on me indeed.

The other day I got an annotated PDF from one of my advisors, and I just couldn’t open the annotations. I tried all programs I had for Mac, and no luck: annotations weren’t showing, just saw the icons. Surveying for some command-line tool to extract annotations (just in case) I found pdf-tools, a replacement for DocView based on Poppler. It had the awesome ability of actually displaying annotations, with it it was pretty clear the annotations were broken in that PDF. I got a new set of PDFs from my advisor with the annotations in place, though. While waiting for it to arrive…

I saw SyncTeX was an option of pdf-tools. I had been using that, hadn’t I? So, I activated SyncTeX in AucTeX (it is TeX-source-correlate-method, see here) and indeed: I could have two frames, one with the actual LaTeX sources and the other with a PDF, and go from one to the other. Even hyperreferences in PDF work! See (well, click on the full-screen mode to see it larger or you won't see anything)!



Getting pdf-tools to work wasn’t incredibly tricky (given the hoops you need for some packages, sometimes). Just

brew install pdf-tools

and after reading

brew info pdf-tools

I was told to run

  emacs -Q --batch --eval "(package-install-file 
\"/usr/local/Cellar/pdf-tools/0.60/pdf-tools-0.60.tar\")"

and this does the trick (well, change emacs for your actual emacs, which likely is /Applications/Emacs.app/Contents/MacOS/Emacs) You’ll also need to add to your .emacs file (or temporarily in your *scratch* buffer)


(setenv "PKG_CONFIG_PATH"

(concat

"/usr/local/Cellar/zlib/1.2.8/lib/pkgconfig"

":"

"/usr/local/lib/pkgconfig:/opt/X11/lib/pkgconfig"))


(getenv "PKG_CONFIG_PATH")


and run

(pdf-tools-install)

as advised in the package’s README. And that's it, open a PDF and activate pdf-view-mode to check everything is in place. Well worth it!
-1:-- SyncTeX and pdf-view-mode for emacs (Post Ruben Berenguel (noreply@blogger.com))--L0--C0--August 15, 2019 09:15 AM

Sanel Zukan: ido-mode with eshell, shell, sql and more...

ido-mode doesn't provide completion for eshell, shell or sql-mode out of the box, and I wasn't been able to find any package with this option that is lightweight enough (Helm, I'm looking at you). So, after few attempts, here is the code for that:

(defun ido-eshell-comint-history ()
  "eshell & comint history with ido."
  (interactive)
  (if (or (member major-mode '(eshell-mode sql-interactive-mode))
          (derived-mode-p 'comint-mode))
     (let ((ring (if (eq major-mode 'eshell-mode)
                   eshell-history-ring
                   comint-input-ring)))
       (insert
        (ido-completing-read "History: "
                             (delete-dups
                              (ring-elements ring)))))
    (message "Unsupported mode")))

After evaluating the code, in eshell simply execute M-x ido-eshell-comint-history and magic will happen - you will get full eshell history managed by ido-mode. To make it more authentic, map it to C-c C-l, which is default sequence for completion in eshell.

The same can be applied for shell and sql-mode as well.

But, the things gets interesting here: (ido-eshell-comint-history) will pick up history for all modes inherided from comint-mode (shell is using comint-mode), so you will get history completion for python, lua, lisp and other inferior processes for free!

Just to add (commenting above code), although sql-mode is using comint-mode, it doesn't derive it so additional check is necessary. Also to enable history in sql-mode you will need to set variable sql-input-ring-file-name as well.

-1:-- ido-mode with eshell, shell, sql and more... (Post)--L0--C0--August 06, 2019 10:00 PM

Manuel Uberti: Open Magit from Ibuffer

It’s amazing how many packages for our beloved editor exist out there. ELPA and MELPA keep growing and growing, and curiosity is always pushing me out to look for a new package to try.

Last year I wrote about Eyebrowse, a cool workspace manager that has proven to be a reliable friend until recently. Eyebrowse is a helpful extension, but as it turns out the built-in Ibuffer is enough to make sense of all of the buffers available. I open it with C-x C-b, which is one of the most used key bindings in my everyday Emacs interactions.

Ibuffer presents a customizable list of buffers on which, much like Dired, we can apply different kinds of operations: filter, sort, group, mark, delete, bury, visit. Just press h in Ibuffer to get an idea.

Since I mainly work on projects versioned on Git, ibuffer-vc helps with grouping the buffers in a project-based fashion and operate on them. For instance, to quickly close a project and its related buffers, mark the project header for deletion with d and then press x.

One thing I was missing in Ibuffer was an integration with Magit. I wanted to open magit-status in the project the current buffer (i.e., the buffer where point is on) belongs to.

(defun mu-ibuffer-magit ()
  "Open `magit-status' for the current buffer."
  (interactive)
  (let ((buf (ibuffer-current-buffer t)))
    (magit-status (cdr (ibuffer-vc-root buf)))))

I’ve bound this function to v in ibuffer-mode-map. Note that v was previously calling ibuffer-do-view, so you may want to pick the key that suits you best.

-1:-- Open Magit from Ibuffer (Post)--L0--C0--August 06, 2019 12:00 AM

Marcin Borkowski: datefudge and agenda testing

Some time ago, a question was asked on the Org-mode mailing list about a specific kind of task in Org agenda. This made me think about debugging one’s agenda settings. I’ve already written about batch agenda, but one problem with agenda testing is that it is inherently stateful, in one of the worst ways – it depends on the notion of now. Debugging time-related stuff is hard. (Well, time-related stuff is hard, after all.) It would be great if you could just manipulate Emacs into thinking that the time is some day in the future (or in the past)… Well, actually, it can be done – and it’s easier than I thought.
-1:-- datefudge and agenda testing (Post)--L0--C0--August 05, 2019 05:04 PM

Shae Erisson: Open Source Hardware Hearing Aid Part 1

Why pay thousands for a hearing aid when you could pay $300?
-1:-- Open Source Hardware Hearing Aid Part 1 (Post Shae Erisson)--L0--C0--August 03, 2019 12:00 AM

Alex Schroeder: Use of Emacs

Two Years With Emacs as a CEO (and now CTO) is the followup to A CEO's Guide to Emacs. The author talks about life with Emacs.

So many cool quotes in that article!

On using a tool that isn’t designed to do a task (an application) but on using a tool help you do your task. Emacs is a tool, not an app.

Emacs is a portal into the power of the computer itself, and that is a rabbit hole worth descending. Its idioms are paths to discovering and creating your own, and that for me is the definition of creativity. One of the sad things about modern computing is that it is largely made up of black boxes with shiny interfaces that provide momentary gratification rather than real satisfaction. This makes us into consumers rather than creators/makers of technology. I don’t care who you are or what your background is; you can understand your computer, and you can make things with it.

On not using Emacs for shared todo items and calendar stuff:

I gave up on using Org-mode for to dos and the like, as I have to coordinate many meetings and calls every day with dozens of people, and I cannot ask the rest of the world to adapt to my choice of tools, nor do I have the time to transcribe or automate moving things to Org.

On not using Emacs for takin notes during meetings because it’s often better to use a pen (unless the point of the meeting is to write a document).

I also use a plain old pen for note-taking during meetings, as I find laptop/keyboard use in meetings to be rude and limiting to my ability to listen and think.

Ah yes. At work, I usually joke that Emacs is just my IRC client... 🙂

Tags:

-1:-- Use of Emacs (Post)--L0--C0--August 02, 2019 10:09 PM

emacsninja: Code Conversion Language

Update: I forgot that I did a brief analysis on this many years ago, using ROT13 as example.

Update: Noam Postavsky pointed out on #emacs that CCL is not turing-complete after all as a full simulation (as opposed to just interpreting a single line) requires an Emacs Lisp loop. This loop cannot be done in CCL itself as it doesn’t allow feeding its output back in as input. The I/O restrictions most likely put it into the weaker category of finite-state transducers.

Emacs is most famously, a re-imagination of a Lisp machine, with the Emacs Lisp byte-code interpreter being at its core. A lesser-known fact is that there’s two more byte-code interpreters in its C sources, one for compiled regular expressions and another designed for encoding and decoding text, known as Code Conversion Language (CCL). This blog post will focus on the latter as it’s largely gone unnoticed and hasn’t seen too much experimentation.

The CCL implementation is split into the byte-code interpreter (ccl.c) and compiler (ccl.el) parts. There is no official documentation other than comments and docstrings found in these files. From this I’ve learned that CCL programs are represented as integer vectors and that there’s a higher-level language compiling to them, described in the ccl-define-program docstring. By reading that information I’ve deduced the following:

  • The VM has eight integer-sized registers r0 to r7 and an instruction counter ic
  • Register r7 is used as a status register and may be clobbered at any time by an arithmetic operation
  • A CCL program can either be run on a string and return a string, alternatively it can be run standalone for side effects
  • The former mode requires you to provide a nine-element status vector representing the registers and instruction counter, the latter an eight-element status vector representing the registers only
  • As a side-effect, the status vector contains the new state of the registers and instruction counter after executing the program
  • The VM supports the standard C arithmetic, comparison and assignment operators
  • The language translates several control flow statements to equivalent goto statements, such as if, branch (look-up table) and loop with repeat inside
  • Statements are grouped by surrounding them with parentheses
  • When operating on a string, they are read in and written out in a serial fashion, no random access whatsoever
  • It’s possible to do a look-up on an array, translation table or hash table
  • There is a call operator, but no stack to save/restore arguments to/from, so you’ll have to come up with a calling convention fitting the available registers
  • Each CCL program specifies a magnification factor which determines the ratio between output and input string size

Armed with that knowledge I wrote some boiler plate code for experimentation:

;; -*- lexical-binding: t; -*-

(require 'ccl)

(defvar ccl-status (make-vector 8 0))
(defvar ccl-status+ic (make-vector 9 0))

(defun ccl-init-status (status args)
  (let ((i 0))
    (fillarray status 0)
    (dolist (arg args)
      (aset status i arg)
      (setq i (1+ i)))))

(defun ccl-run (program string &rest args)
  (let ((status ccl-status+ic))
    (ccl-init-status status args)
    (ccl-execute-on-string program status string)))

(defun ccl-run-pure (program &rest args)
  (let ((status ccl-status))
    (ccl-init-status status args)
    (ccl-execute program status)
    status))

There will be some benchmark numbers, none of these are to be taken seriously. Do your own benchmarks before using mine for decisions.

Hello World!

For starters I’ll focus on processing strings. The easiest possible program that still does something useful reads in output and writes it out as is:

(define-ccl-program ccl-identity
  '(1
    (loop
     (read r0)
     (write r0)
     (repeat))))

(ccl-run 'ccl-identity "Hello World!") ;=> "Hello World!"

Let’s go through that program carefully. The S-Expression starts with a magnification factor of 1, meaning that the output buffer should be as large as the input buffer. If it were zero, no I/O would be permitted in the first place, whereas a factor greater than one would allocate enough space to produce a string larger than the input.

The magnification factor is followed by a s-expression that’s executed until it’s done or an error occurred, such as there being no more input. It may be followed by another s-expression that’s executed after the main one, no matter whether it failed with an error or not.

ccl-identity uses a pattern that will come up a few more times in this blog post. It enters a loop, reads a character into the r0 register, writes out a character from the r0 register and jumps to the beginning of the loop. If there are no more characters left, the read operation fails and terminates the loop. Let’s spice things up by adding an extra processing step before writing out the character:

(define-ccl-program ccl-xor
  '(1
    (loop
     (read r1)
     (r1 ^= r0)
     (write r1)
     (repeat))))

(ccl-run 'ccl-xor "Secret" 42) ;=> "yOIXO^"
(ccl-run 'ccl-xor "yOIXO^" 42) ;=> "Secret"

XOR is the bread and butter operator in modern cryptography. A text can be encrypted by replacing each character with the result of XORing it against a secret byte, similarly it can be decrypted by applying the same transformation again. To pass the secret byte as an argument, I’ve placed it in the r0 register and read the string into the r1 register instead. On each iteration of the loop r1 is set to r1 ^ r0 and written out again.

More on translation

In the real world translating characters isn’t as simple as applying some arithmetic to them. Suppose I wanted to challenge the upcase built-in:

(define-ccl-program ccl-upcase
  '(1
    (loop
     (read r0)
     (if (r0 >= ?a)
         (if (r0 <= ?z)
             (r0 -= 32)))
     (write r0)
     (repeat))))

The processing step is a bit more involved this time. If the read-in character appears to be between the a and z characters, transform it by subtracting 32. Why 32? Take a look at an ASCII table and you’ll see that this is the distance between uppercase and lowercase letters. Unfortunately this implementation cannot challenge upcase as it fails to translate non-ASCII characters correctly and is slower than the real deal:

(ccl-run 'ccl-upcase "Hello World!") ;=> "HELLO WORLD!"
(ccl-run 'ccl-upcase "Mötley Crüe") ;=> "MöTLEY CRüE"
(benchmark 100000 '(ccl-run 'ccl-upcase "Hello World!"))
;; => "Elapsed time: 0.165250s (0.072059s in 1 GCs)"
(benchmark 100000 '(upcase "Hello World!"))
;; => "Elapsed time: 0.119050s (0.072329s in 1 GCs)"

Let’s try again with a different text transformation where I actually have a chance to win, ROT13:

(define-ccl-program ccl-rot13
  '(1
    (loop
     (read r0)
     (if (r0 >= ?a)
         (if (r0 <= ?z)
             ((r0 -= ?a)
              (r0 += 13)
              (r0 %= 26)
              (r0 += ?a))))
     (if (r0 >= ?A)
         (if (r0 <= ?Z)
             ((r0 -= ?A)
              (r0 += 13)
              (r0 %= 26)
              (r0 += ?A))))
     (write r0)
     (repeat))))

This time the program needs to recognize two different character ranges to process, lowercase and uppercase ASCII characters. In either case they’re translated to their position in the alphabet, rotated by 13, then translated back to ASCII again. Surprisingly enough, this is enough to beat both rot13-string and rot13-region:

(ccl-run 'ccl-rot13 "Hello World!") ;=> "Uryyb Jbeyq!"
(ccl-run 'ccl-rot13 (ccl-run 'ccl-rot13 "Hello World!"))
;; => "Hello World!"
(benchmark 100000 '(ccl-run 'ccl-rot13 "Hello World!"))
;; => "Elapsed time: 0.248791s (0.072622s in 1 GCs)"
(benchmark 100000 '(rot13-string "Hello World!"))
;; => "Elapsed time: 6.108861s (2.360862s in 32 GCs)"
(with-temp-buffer
  (insert "Hello World!")
  (benchmark 100000 '(rot13-region (point-min) (point-max))))
;; => "Elapsed time: 1.489205s (1.017631s in 14 GCs)"

I then tried to use translation tables for a final example of a “Vaporwave” converter, but failed. Funnily enough this mirrors my overall experience with Emacs, it’s easy to write fun things, but the moment one tries to write something useful, you discover it’s not fun and sometimes not even up to the task. At least it’s possible to salvage the translation tables and use them with translate-region instead, the built-in used by rot13-string and rot13-region:

(defvar ccl-vaporwave-table
  (make-translation-table-from-alist
   (cons '(?\s . 12288)
         (mapcar (lambda (i) (cons i (+ i 65248)))
                 (number-sequence 33 126)))))

(defun vaporwave-it (string)
  (with-temp-buffer
    (insert string)
    (translate-region (point-min) (point-max) ccl-vaporwave-table)
    (buffer-string)))

(vaporwave-it (upcase "aesthetic")) ;=> "AESTHETIC"

Edging towards general-purpose computing

All examples so far have worked on text. If you limit yourself to numbers, you can solve some basic arithmetic problems. Here’s a classic, calculating the factorial of a number:

(define-ccl-program ccl-factorial
  '(0
    ((r1 = 1)
     (loop
      (if r0
          ((r1 *= r0)
           (r0 -= 1)
           (repeat)))))))

(defun factorial (n)
  (let ((acc 1))
    (while (not (zerop n))
      (setq acc (* acc n))
      (setq n (1- n)))
    acc))

While the regular version is more concise, the logic is nearly the same in both. Here’s some numbers:

(aref (ccl-run-pure 'ccl-factorial 10) 1) ;=> 3628800
(factorial 10) ;=> 3628800
(benchmark 100000 '(ccl-run-pure 'ccl-factorial 10))
;; => "Elapsed time: 0.069063s"
(benchmark 100000 '(factorial 10))
;; => "Elapsed time: 0.080212s"

This isn’t nearly as much of a speed-up as I’ve hoped for. Perhaps CCL pays off more when doing arithmetic than for looping? Another explanation is that the Emacs Lisp byte-code compiler has an edge over CCL’s rather simple one. Here’s a more entertaining example, printing out the lyrics of 99 Bottles of Beer on the Wall:

(define-ccl-program ccl-print-bottle-count
  '(1
    (if (r0 < 10)
        (write (r0 + ?0))
      ((write ((r0 / 10) + ?0))
       (write ((r0 % 10) + ?0))))))

(define-ccl-program ccl-99-bottles
  '(1
    (loop
     (if (r0 > 2)
         ((call ccl-print-bottle-count)
          (write " bottles of beer on the wall, ")
          (call ccl-print-bottle-count)
          (write " bottles of beer.\n")
          (write "Take one down and pass it around, ")
          (r0 -= 1)
          (call ccl-print-bottle-count)
          (write " bottles of beer on the wall.\n\n")
          (repeat))
       ((write "2 bottles of beer on the wall, 2 bottles of beer.\n")
        (write "Take one down and pass it around, 1 bottle of beer on the wall.\n\n")
        (write "1 bottle of beer on the wall, 1 bottle of beer.\n")
        (write "Take one down and pass it around, no more bottles of beer on the wall.\n\n")
        (write "No more bottles of beer on the wall, no more bottles of beer.\n")
        (write "Go to the store and buy some more, 99 bottles of beer on the wall.\n"))))))

(defun 99-bottles ()
  (with-output-to-string
    (let ((i 99))
      (while (> i 2)
        (princ (format "%d bottles of beer on the wall, %d bottles of beer.\n" i i))
        (princ (format "Take one down and pass it around, %d bottles of beer on the wall.\n\n" (1- i)))
        (setq i (- i 1))))
    (princ "2 bottles of beer on the wall, 2 bottles of beer.\n")
    (princ "Take one down and pass it around, 1 bottle of beer on the wall.\n\n")
    (princ "1 bottle of beer on the wall, 1 bottle of beer.\n")
    (princ "Take one down and pass it around, no more bottles of beer.\n\n")
    (princ "No more bottles of beer on the wall, no more bottles of beer.\n")
    (princ "Go to the store and buy some more, 99 bottles of beer on the wall.\n")))

This example shows a few more interesting things, generating text of unknown length is rather hard, so I’m using the standard magnification factor of 1 and estimate how big the buffer will be to create an appropriately sized input string. call is useful to not repeat yourself, at the cost of having to carefully plan register usage. Printing out the bottle count can be done if you’re limiting yourself to whole numbers up to 100, a generic solution is going to be hard without random access to the output string. The performance numbers for this one are somewhat surprising:

(let ((input (make-string 15000 ?\s)))
  (benchmark 1000 '(ccl-run 'ccl-99-bottles input 99)))
;; => "Elapsed time: 0.301170s (0.217804s in 3 GCs)"
(benchmark 1000 '(my-99-bottles))
;; => "Elapsed time: 1.735386s (0.507231s in 7 GCs)"

This doesn’t make much sense. Is using format that expensive? It’s hard to tell in advance whether CCL will make a noticable difference or not.

But is it Turing-complete?

My experimentation so far left me wondering, is this language turing-complete? You can perform arithmetics, there’s goto, but the I/O facilities, amount of registers and memory access are limited. The easiest way of proving this property is by implementing another known turing-complete system on top of your current one. I researched a bit and found the following candidates:

  • Brainfuck: A classic, however it requires writable memory. Registers could be used for this, but you don’t have many to play with. You’d need the branch instruction to simulate the data pointer.
  • subleq: Implementing subleq looks easy, but suffers from the same problem as Brainfuck, it requires you to modify an arbitrary memory location. I’ve found a compiler from a C subset to subleq that generates code operating beyond the handful of registers, so that’s not an option either.
  • Rule 110: It’s basically Game of Life, but one-dimensional and can be implemented in a serial fashion. With some tricks it doesn’t require random access either. The proof of it being turing-complete looks painful, but whatever, I don’t care. It’s perfect. There are more elementary cellular automata, so I’ll try to implement it in a generic fashion and demonstrate it on Rule 90 which produces the Sierpinski triangle.
(defmacro define-ccl-automaton (n)
  (let ((print-sym (intern (format "ccl-rule%d-print" n)))
        (rule-sym (intern (format "ccl-rule%d" n))))
    `(progn
       (define-ccl-program ,print-sym
         '(1
           ((r4 = 0)
            (if (r0 == ?1)
                (r4 += 4))
            (if (r1 == ?1)
                (r4 += 2))
            (if (r2 == ?1)
                (r4 += 1))
            (branch r4
                    (write ,(if (zerop (logand n 1)) ?0 ?1))
                    (write ,(if (zerop (logand n 2)) ?0 ?1))
                    (write ,(if (zerop (logand n 4)) ?0 ?1))
                    (write ,(if (zerop (logand n 8)) ?0 ?1))
                    (write ,(if (zerop (logand n 16)) ?0 ?1))
                    (write ,(if (zerop (logand n 32)) ?0 ?1))
                    (write ,(if (zerop (logand n 64)) ?0 ?1))
                    (write ,(if (zerop (logand n 128)) ?0 ?1))))))
       (define-ccl-program ,rule-sym
         '(1
           ((r6 = ,n)
            (r0 = 0)
            (read r1)
            (read r2)
            (loop
             (call ,print-sym)
             (read r3)
             (r0 = r1)
             (r1 = r2)
             (r2 = r3)
             (repeat)))
           ((r0 = r1)
            (r1 = r2)
            (r2 = r5)
            (call ,print-sym)))))))

(define-ccl-automaton 30)
(define-ccl-automaton 90)
(define-ccl-automaton 110)

(defun ccl-sierpinski ()
  (with-output-to-string
    (let ((line "0000000001000000000"))
      (dotimes (_ 20)
        (princ line)
        (terpri)
        (setq line (ccl-run 'ccl-rule90 line))))))

The macro may look scary, but all it does is defining two CCL programs. What an elementary cellular automaton does is looking at the two cells around the current cell, map them to a cell depending to a rule and emit it. There are two edge cases with this for the first and last cell, in my implementation the first cell assumes the previous one was a zero and the last cell uses the first cell. Since there’s no random access, it’s stored into a spare register at the beginning and accessed in a S-Expression after the main loop terminated due to no more input. The surrounding and current cell are stored in three registers and rotated every time a new cell is read in. The mapping is done in the print program by summing up the ones and zeroes, then using the branch instruction to apply the rule to it. If you find this hard to follow, here’s an Emacs Lisp version of it using random access and less limited arithmetic to do the job:

(defun rule--evolve (prev cur next n)
  (let ((acc (+ (if (= prev ?1) 4 0)
                (if (= cur ?1) 2 0)
                (if (= next ?1) 1 0))))
    (if (zerop (logand n (ash 1 acc))) ?0 ?1)))

(defun rule-evolve (line n)
  (let ((out (make-string (length line) ?0)))
    (dotimes (i (length line))
      (cond
       ((zerop i)
        (aset out i (rule--evolve ?0 (aref line i) (aref line (1+ i)) n)))
       ((= i (1- (length line)))
        (aset out i (rule--evolve (aref line (1- i)) (aref line i) (aref line 0) n)))
       (t
        (aset out i (rule--evolve (aref line (1- i)) (aref line i) (aref line (1+ i)) n)))))
    out))

(defun sierpinski ()
  (with-output-to-string
    (let ((line "0000000001000000000"))
      (dotimes (_ 20)
        (princ line)
        (terpri)
        (setq line (rule-evolve line 90))))))

One more benchmark run, this time with less surprising performance numbers:

(benchmark 1000 '(ccl-sierpinski))
;; => "Elapsed time: 0.365031s (0.071827s in 1 GCs)"
(benchmark 1000 '(sierpinski))
;; => "Elapsed time: 0.545512s (0.071829s in 1 GCs)"

If you want to see it in action, try evaluating (progn (princ (sierpinski)) nil) in M-x ielm.

Now for a big letdown, despite everything what I’ve demonstrated, this system is not turing-complete after all. While it’s capable of processing a single line of input, the proof of Rule 110 being turing-complete relies on feeding its output in as input over and over again, however that part has been done in Emacs Lisp as it’s impossible to do in CCL. I’m not 100% sure what CCL’s computing power is, Noam Postavsky suggested on #emacs that it’s most likely a finite-state transducer.

Final words

You may ask yourself now whether you should rewrite all of your slow code to CCL programs. I don’t believe that’s the way to go for a number of reasons:

  • Speedups vary wildly, somewhere between -40% to 450%. There’s no obvious way of predicting whether the optimization is worth it.
  • It’s hard to write an equivalent program or sometimes even impossible, especially if it requires you to use many variables or random access read/write operations.
  • It’s hard to debug a CCL program. While the compiler does a good job at catching syntax errors, runtime errors are far harder to figure out if you can only stare at the registers. Maybe a debugger could be built that uses the “continue” facility and instruction counter register…
  • It’s hard to maintain a CCL program. Not to mention, there’s hardly people who know how to write them. Most of the examples I’ve found online do text encoding/decoding. The only exception to this is pgg-parse-crc24-string which lives in a file marked as obsolete.
  • This leads me to my last point, CCL may be obsoleted as well. Granted, it will take time, but so far I haven’t seen people enthusiastic about it being a thing.

If you still believe that despite of this it’s worth giving CCL a try, please let me know, especially if you’re doing something non-standard with it, like advanced cryptography or number crunching. Likewise, if you’re not convinced that it’s turing-complete or that I could be doing some things far better than presented, send me a message.

-1:-- Code Conversion Language (Post Vasilij Schneidermann)--L0--C0--July 23, 2019 06:37 PM

Chen Bin (redguardtoo): Javascript code navigation in counsel-etags

Javascript code navigation is supported by counsel-etags out of box.

It can support new javascript syntax like arrow function easily because counsel-etags is just frontend.

It reads tags file created by backend CLI program Ctags. Ctags uses regular expression to extract tag name from source code.

But there are some syntax which regular expression could not help.

For example, json object path can't be extracted by regular expression.

Given an object a in file A,

var a = {
  b: {
    c: 3,
  }
};

File B has code let v1 = a.b.c;, how can we jump to the definition of the field c from json path a.b.c?

The solution is to use Lisp to parse code in file A and generate extra navigation data which could be appended to tags file generated by Ctags.

The algorithm is simple,

  • Traverse all the field of object a in file A. Use API js2-print-json-path from js2-mode to get json path of current field.
  • The json path could be regarded as tags name. We've already got file name and line number. So there is enough information to create navigation data for tags file. Here is tags file format.

Necessary utilities are already provided by counsel-etags v1.8.7,

  • After tags files is generated by Ctags, the hook counsel-etags-after-update-tags-hook is executed. Users can append tags file in this hook
  • (counsel-etags-tag-line code-snippet tag-name line-number byte-offset) return a line which could be appended into tags file

My current project uses a technology called styled-components which has an advanced feature theming.

It could dynamically change web application's appearance and is a critical technology for our application to support multiple customer. Application's theme is basically a file containing a huge json object. So it's important that developers can jump to the corresponding json object's field by json path.

Screencast

counsel-etags-plus-json-path.gif

Code

(require 'counsel-etags)

(defun my-manual-update-tag-file (code-file tags-file)
  (let* ((dir (file-name-directory tags-file))
         (path (concat dir code-file))
         curline
         jp
         tagstr)
    (unless (featurep 'js2-mode) (require 'js2-mode))
    (with-temp-buffer
      (insert-file-contents path)
      (js2-init-scanner)
      (js2-do-parse)
      (goto-char (point-min))
      ;; find all js object property names
      (while (re-search-forward "\"?[a-z][a-zA-Z0-9]*\"?:" (point-max) t)
        (when (setq jp (js2-print-json-path))
          (setq curline (string-trim (buffer-substring-no-properties (line-beginning-position)
                                                                     (line-end-position))))
          (setq tagstr (concat tagstr
                               (counsel-etags-tag-line curline
                                                       jp
                                                       (count-lines 1 (point))
                                                       (point)))))
        ;; move focus to next search
        (goto-char (line-end-position))))
    (when tagstr
      (with-temp-buffer
        (insert-file-contents tags-file)
        (goto-char (line-end-position))
        (insert (format "\n\014\n%s,%d\n%s" code-file 0 tagstr))
        (write-region (point-min) (point-max) tags-file nil :silent)))))

(defun counsel-etags-after-update-tags-hook-setup (tags-file)
    (my-manual-update-tag-file "frontend/theming/themes/darkTheme.js" tags-file)
    (my-manual-update-tag-file "frontend/theming/themes/lightTheme.js" tags-file))
(add-hook 'counsel-etags-after-update-tags-hook 'counsel-etags-after-update-tags-hook-setup)
-1:-- Javascript code navigation in counsel-etags (Post Chen Bin)--L0--C0--July 22, 2019 12:07 PM

Shae Erisson: Hyper and Super on my keyboard?

Stranger Strings or the new-ish way to redefine keyboard layouts in the X Window System
-1:-- Hyper and Super on my keyboard? (Post Shae Erisson)--L0--C0--July 21, 2019 12:00 AM

emacsninja: Code Conversion Language

Update: I forgot that I did a brief analysis on this many years ago, using ROT13 as example.

Emacs is most famously, a re-imagination of a Lisp machine, with the Emacs Lisp byte-code interpreter being at its core. A lesser-known fact is that there’s two more byte-code interpreters in its C sources, one for compiled regular expressions and another designed for encoding and decoding text, known as Code Conversion Language (CCL). This blog post will focus on the latter as it’s largely gone unnoticed and hasn’t seen too much experimentation.

The CCL implementation is split into the byte-code interpreter (ccl.c) and compiler (ccl.el) parts. There is no official documentation other than comments and docstrings found in these files. From this I’ve learned that CCL programs are represented as integer vectors and that there’s a higher-level language compiling to them, described in the ccl-define-program docstring. By reading that information I’ve deduced the following:

  • The VM has eight integer-sized registers r0 to r7 and an instruction counter ic
  • Register r7 is used as a status register and may be clobbered at any time by an arithmetic operation
  • A CCL program can either be run on a string and return a string, alternatively it can be run standalone for side effects
  • The former mode requires you to provide a nine-element status vector representing the registers and instruction counter, the latter an eight-element status vector representing the registers only
  • As a side-effect, the status vector contains the new state of the registers and instruction counter after executing the program
  • The VM supports the standard C arithmetic, comparison and assignment operators
  • The language translates several control flow statements to equivalent goto statements, such as if, branch (look-up table) and loop with repeat inside
  • Statements are grouped by surrounding them with parentheses
  • When operating on a string, they are read in and written out in a serial fashion, no random access whatsoever
  • It’s possible to do a look-up on an array, translation table or hash table
  • There is a call operator, but no stack to save/restore arguments to/from, so you’ll have to come up with a calling convention fitting the available registers
  • Each CCL program specifies a magnification factor which determines the ratio between output and input string size

Armed with that knowledge I wrote some boiler plate code for experimentation:

;; -*- lexical-binding: t; -*-

(require 'ccl)

(defvar ccl-status (make-vector 8 0))
(defvar ccl-status+ic (make-vector 9 0))

(defun ccl-init-status (status args)
  (let ((i 0))
    (fillarray status 0)
    (dolist (arg args)
      (aset status i arg)
      (setq i (1+ i)))))

(defun ccl-run (program string &rest args)
  (let ((status ccl-status+ic))
    (ccl-init-status status args)
    (ccl-execute-on-string program status string)))

(defun ccl-run-pure (program &rest args)
  (let ((status ccl-status))
    (ccl-init-status status args)
    (ccl-execute program status)
    status))

There will be some benchmark numbers, none of these are to be taken seriously. Do your own benchmarks before using mine for decisions.

Hello World!

For starters I’ll focus on processing strings. The easiest possible program that still does something useful reads in output and writes it out as is:

(define-ccl-program ccl-identity
  '(1
    (loop
     (read r0)
     (write r0)
     (repeat))))

(ccl-run 'ccl-identity "Hello World!") ;=> "Hello World!"

Let’s go through that program carefully. The S-Expression starts with a magnification factor of 1, meaning that the output buffer should be as large as the input buffer. If it were zero, no I/O would be permitted in the first place, whereas a factor greater than one would allocate enough space to produce a string larger than the input.

The magnification factor is followed by a s-expression that’s executed until it’s done or an error occurred, such as there being no more input. It may be followed by another s-expression that’s executed after the main one, no matter whether it failed with an error or not.

ccl-identity uses a pattern that will come up a few more times in this blog post. It enters a loop, reads a character into the r0 register, writes out a character from the r0 register and jumps to the beginning of the loop. If there are no more characters left, the read operation fails and terminates the loop. Let’s spice things up by adding an extra processing step before writing out the character:

(define-ccl-program ccl-xor
  '(1
    (loop
     (read r1)
     (r1 ^= r0)
     (write r1)
     (repeat))))

(ccl-run 'ccl-xor "Secret" 42) ;=> "yOIXO^"
(ccl-run 'ccl-xor "yOIXO^" 42) ;=> "Secret"

XOR is the bread and butter operator in modern cryptography. A text can be encrypted by replacing each character with the result of XORing it against a secret byte, similarly it can be decrypted by applying the same transformation again. To pass the secret byte as an argument, I’ve placed it in the r0 register and read the string into the r1 register instead. On each iteration of the loop r1 is set to r1 ^ r0 and written out again.

More on translation

In the real world translating characters isn’t as simple as applying some arithmetic to them. Suppose I wanted to challenge the upcase built-in:

(define-ccl-program ccl-upcase
  '(1
    (loop
     (read r0)
     (if (r0 >= ?a)
         (if (r0 <= ?z)
             (r0 -= 32)))
     (write r0)
     (repeat))))

The processing step is a bit more involved this time. If the read-in character appears to be between the a and z characters, transform it by subtracting 32. Why 32? Take a look at an ASCII table and you’ll see that this is the distance between uppercase and lowercase letters. Unfortunately this implementation cannot challenge upcase as it fails to translate non-ASCII characters correctly and is slower than the real deal:

(ccl-run 'ccl-upcase "Hello World!") ;=> "HELLO WORLD!"
(ccl-run 'ccl-upcase "Mötley Crüe") ;=> "MöTLEY CRüE"
(benchmark 100000 '(ccl-run 'ccl-upcase "Hello World!"))
;; => "Elapsed time: 0.165250s (0.072059s in 1 GCs)"
(benchmark 100000 '(upcase "Hello World!"))
;; => "Elapsed time: 0.119050s (0.072329s in 1 GCs)"

Let’s try again with a different text transformation where I actually have a chance to win, ROT13:

(define-ccl-program ccl-rot13
  '(1
    (loop
     (read r0)
     (if (r0 >= ?a)
         (if (r0 <= ?z)
             ((r0 -= ?a)
              (r0 += 13)
              (r0 %= 26)
              (r0 += ?a))))
     (if (r0 >= ?A)
         (if (r0 <= ?Z)
             ((r0 -= ?A)
              (r0 += 13)
              (r0 %= 26)
              (r0 += ?A))))
     (write r0)
     (repeat))))

This time the program needs to recognize two different character ranges to process, lowercase and uppercase ASCII characters. In either case they’re translated to their position in the alphabet, rotated by 13, then translated back to ASCII again. Surprisingly enough, this is enough to beat both rot13-string and rot13-region:

(ccl-run 'ccl-rot13 "Hello World!") ;=> "Uryyb Jbeyq!"
(ccl-run 'ccl-rot13 (ccl-run 'ccl-rot13 "Hello World!"))
;; => "Hello World!"
(benchmark 100000 '(ccl-run 'ccl-rot13 "Hello World!"))
;; => "Elapsed time: 0.248791s (0.072622s in 1 GCs)"
(benchmark 100000 '(rot13-string "Hello World!"))
;; => "Elapsed time: 6.108861s (2.360862s in 32 GCs)"
(with-temp-buffer
  (insert "Hello World!")
  (benchmark 100000 '(rot13-region (point-min) (point-max))))
;; => "Elapsed time: 1.489205s (1.017631s in 14 GCs)"

I then tried to use translation tables for a final example of a “Vaporwave” converter, but failed. Funnily enough this mirrors my overall experience with Emacs, it’s easy to write fun things, but the moment one tries to write something useful, you discover it’s not fun and sometimes not even up to the task. At least it’s possible to salvage the translation tables and use them with translate-region instead, the built-in used by rot13-string and rot13-region:

(defvar ccl-vaporwave-table
  (make-translation-table-from-alist
   (cons '(?\s . 12288)
         (mapcar (lambda (i) (cons i (+ i 65248)))
                 (number-sequence 33 126)))))

(defun vaporwave-it (string)
  (with-temp-buffer
    (insert string)
    (translate-region (point-min) (point-max) ccl-vaporwave-table)
    (buffer-string)))

(vaporwave-it (upcase "aesthetic")) ;=> "AESTHETIC"

Edging towards general-purpose computing

All examples so far have worked on text. If you limit yourself to numbers, you can solve some basic arithmetic problems. Here’s a classic, calculating the factorial of a number:

(define-ccl-program ccl-factorial
  '(0
    ((r1 = 1)
     (loop
      (if r0
          ((r1 *= r0)
           (r0 -= 1)
           (repeat)))))))

(defun factorial (n)
  (let ((acc 1))
    (while (not (zerop n))
      (setq acc (* acc n))
      (setq n (1- n)))
    acc))

While the regular version is more concise, the logic is nearly the same in both. Here’s some numbers:

(aref (ccl-run-pure 'ccl-factorial 10) 1) ;=> 3628800
(factorial 10) ;=> 3628800
(benchmark 100000 '(ccl-run-pure 'ccl-factorial 10))
;; => "Elapsed time: 0.069063s"
(benchmark 100000 '(factorial 10))
;; => "Elapsed time: 0.080212s"

This isn’t nearly as much of a speed-up as I’ve hoped for. Perhaps CCL pays off more when doing arithmetic than for looping? Another explanation is that the Emacs Lisp byte-code compiler has an edge over CCL’s rather simple one. Here’s a more entertaining example, printing out the lyrics of 99 Bottles of Beer on the Wall:

(define-ccl-program ccl-print-bottle-count
  '(1
    (if (r0 < 10)
        (write (r0 + ?0))
      ((write ((r0 / 10) + ?0))
       (write ((r0 % 10) + ?0))))))

(define-ccl-program ccl-99-bottles
  '(1
    (loop
     (if (r0 > 2)
         ((call ccl-print-bottle-count)
          (write " bottles of beer on the wall, ")
          (call ccl-print-bottle-count)
          (write " bottles of beer.\n")
          (write "Take one down and pass it around, ")
          (r0 -= 1)
          (call ccl-print-bottle-count)
          (write " bottles of beer on the wall.\n\n")
          (repeat))
       ((write "2 bottles of beer on the wall, 2 bottles of beer.\n")
        (write "Take one down and pass it around, 1 bottle of beer on the wall.\n\n")
        (write "1 bottle of beer on the wall, 1 bottle of beer.\n")
        (write "Take one down and pass it around, no more bottles of beer on the wall.\n\n")
        (write "No more bottles of beer on the wall, no more bottles of beer.\n")
        (write "Go to the store and buy some more, 99 bottles of beer on the wall.\n"))))))

(defun 99-bottles ()
  (with-output-to-string
    (let ((i 99))
      (while (> i 2)
        (princ (format "%d bottles of beer on the wall, %d bottles of beer.\n" i i))
        (princ (format "Take one down and pass it around, %d bottles of beer on the wall.\n\n" (1- i)))
        (setq i (- i 1))))
    (princ "2 bottles of beer on the wall, 2 bottles of beer.\n")
    (princ "Take one down and pass it around, 1 bottle of beer on the wall.\n\n")
    (princ "1 bottle of beer on the wall, 1 bottle of beer.\n")
    (princ "Take one down and pass it around, no more bottles of beer.\n\n")
    (princ "No more bottles of beer on the wall, no more bottles of beer.\n")
    (princ "Go to the store and buy some more, 99 bottles of beer on the wall.\n")))

This example shows a few more interesting things, generating text of unknown length is rather hard, so I’m using the standard magnification factor of 1 and estimate how big the buffer will be to create an appropriately sized input string. call is useful to not repeat yourself, at the cost of having to carefully plan register usage. Printing out the bottle count can be done if you’re limiting yourself to whole numbers up to 100, a generic solution is going to be hard without random access to the output string. The performance numbers for this one are somewhat surprising:

(let ((input (make-string 15000 ?\s)))
  (benchmark 1000 '(ccl-run 'ccl-99-bottles input 99)))
;; => "Elapsed time: 0.301170s (0.217804s in 3 GCs)"
(benchmark 1000 '(my-99-bottles))
;; => "Elapsed time: 1.735386s (0.507231s in 7 GCs)"

This doesn’t make much sense. Is using format that expensive? It’s hard to tell in advance whether CCL will make a noticable difference or not.

But is it Turing-complete?

My experimentation so far left me wondering, is this language turing-complete? You can perform arithmetics, there’s goto, but the I/O facilities, amount of registers and memory access are limited. The easiest way of proving this property is by implementing another known turing-complete system on top of your current one. I researched a bit and found the following candidates:

  • Brainfuck: A classic, however it requires writable memory. Registers could be used for this, but you don’t have many to play with. You’d need the branch instruction to simulate the data pointer.
  • subleq: Implementing subleq looks easy, but suffers from the same problem as Brainfuck, it requires you to modify an arbitrary memory location. I’ve found a compiler from a C subset to subleq that generates code operating beyond the handful of registers, so that’s not an option either.
  • Rule 110: It’s basically Game of Life, but one-dimensional and can be implemented in a serial fashion. With some tricks it doesn’t require random access either. The proof of it being turing-complete looks painful, but whatever, I don’t care. It’s perfect. There are more elementary cellular automata, so I’ll try to implement it in a generic fashion and demonstrate it on Rule 90 which produces the Sierpinski triangle.
(defmacro define-ccl-automaton (n)
  (let ((print-sym (intern (format "ccl-rule%d-print" n)))
        (rule-sym (intern (format "ccl-rule%d" n))))
    `(progn
       (define-ccl-program ,print-sym
         '(1
           ((r4 = 0)
            (if (r0 == ?1)
                (r4 += 4))
            (if (r1 == ?1)
                (r4 += 2))
            (if (r2 == ?1)
                (r4 += 1))
            (branch r4
                    (write ,(if (zerop (logand n 1)) ?0 ?1))
                    (write ,(if (zerop (logand n 2)) ?0 ?1))
                    (write ,(if (zerop (logand n 4)) ?0 ?1))
                    (write ,(if (zerop (logand n 8)) ?0 ?1))
                    (write ,(if (zerop (logand n 16)) ?0 ?1))
                    (write ,(if (zerop (logand n 32)) ?0 ?1))
                    (write ,(if (zerop (logand n 64)) ?0 ?1))
                    (write ,(if (zerop (logand n 128)) ?0 ?1))))))
       (define-ccl-program ,rule-sym
         '(1
           ((r6 = ,n)
            (r0 = 0)
            (read r1)
            (read r2)
            (loop
             (call ,print-sym)
             (read r3)
             (r0 = r1)
             (r1 = r2)
             (r2 = r3)
             (repeat)))
           ((r0 = r1)
            (r1 = r2)
            (r2 = r5)
            (call ,print-sym)))))))

(define-ccl-automaton 30)
(define-ccl-automaton 90)
(define-ccl-automaton 110)

(defun ccl-sierpinski ()
  (with-output-to-string
    (let ((line "0000000001000000000"))
      (dotimes (_ 20)
        (princ line)
        (terpri)
        (setq line (ccl-run 'ccl-rule90 line))))))

The macro may look scary, but all it does is defining two CCL programs. What an elementary cellular automaton does is looking at the two cells around the current cell, map them to a cell depending to a rule and emit it. There are two edge cases with this for the first and last cell, in my implementation the first cell assumes the previous one was a zero and the last cell uses the first cell. Since there’s no random access, it’s stored into a spare register at the beginning and accessed in a S-Expression after the main loop terminated due to no more input. The surrounding and current cell are stored in three registers and rotated every time a new cell is read in. The mapping is done in the print program by summing up the ones and zeroes, then using the branch instruction to apply the rule to it. If you find this hard to follow, here’s an Emacs Lisp version of it using random access and less limited arithmetic to do the job:

(defun rule--evolve (prev cur next n)
  (let ((acc (+ (if (= prev ?1) 4 0)
                (if (= cur ?1) 2 0)
                (if (= next ?1) 1 0))))
    (if (zerop (logand n (ash 1 acc))) ?0 ?1)))

(defun rule-evolve (line n)
  (let ((out (make-string (length line) ?0)))
    (dotimes (i (length line))
      (cond
       ((zerop i)
        (aset out i (rule--evolve ?0 (aref line i) (aref line (1+ i)) n)))
       ((= i (1- (length line)))
        (aset out i (rule--evolve (aref line (1- i)) (aref line i) (aref line 0) n)))
       (t
        (aset out i (rule--evolve (aref line (1- i)) (aref line i) (aref line (1+ i)) n)))))
    out))

(defun sierpinski ()
  (with-output-to-string
    (let ((line "0000000001000000000"))
      (dotimes (_ 20)
        (princ line)
        (terpri)
        (setq line (rule-evolve line 90))))))

One more benchmark run, this time with less surprising performance numbers:

(benchmark 1000 '(ccl-sierpinski))
;; => "Elapsed time: 0.365031s (0.071827s in 1 GCs)"
(benchmark 1000 '(sierpinski))
;; => "Elapsed time: 0.545512s (0.071829s in 1 GCs)"

If you want to see it in action, try evaluating (progn (princ (sierpinski)) nil) in M-x ielm.

Final words

You may ask yourself now whether you should rewrite all of your slow code to CCL programs. I don’t believe that’s the way to go for a number of reasons:

  • Speedups vary wildly, somewhere between -40% to 450%. There’s no obvious way of predicting whether the optimization is worth it.
  • It’s hard to write an equivalent program or sometimes even impossible, especially if it requires you to use many variables or random access read/write operations.
  • It’s hard to debug a CCL program. While the compiler does a good job at catching syntax errors, runtime errors are far harder to figure out if you can only stare at the registers. Maybe a debugger could be built that uses the “continue” facility and instruction counter register…
  • It’s hard to maintain a CCL program. Not to mention, there’s hardly people who know how to write them. Most of the examples I’ve found online do text encoding/decoding. The only exception to this is pgg-parse-crc24-string which lives in a file marked as obsolete.
  • This leads me to my last point, CCL may be obsoleted as well. Granted, it will take time, but so far I haven’t seen people enthusiastic about it being a thing.

If you still believe that despite of this it’s worth giving CCL a try, please let me know, especially if you’re doing something non-standard with it, like advanced cryptography or number crunching. Likewise, if you’re not convinced that it’s turing-complete or that I could be doing some things far better than presented, send me a message.

-1:-- Code Conversion Language (Post Vasilij Schneidermann)--L0--C0--July 20, 2019 09:31 PM

(or emacs: Ivy 0.12.0 is out

Intro

Ivy is a completion method that's similar to Ido, but with emphasis on simplicity and customizability.

Overview

The current release constitutes of 398 commits and 6 months of progress since 0.11.0. Many issues ranging from #1904 to #2151 were fixed. The number of people who contributed code as grown to 148. Thanks, everyone!

Details on changes

Changelog.org has been a part of the repository since 0.6.0, you can get the details of the current and past changes:

Highlights

Many improvements are incremental and don't require any extra code to enable. I'll go over a few selected features that require a bit of information to make a good use of them.

New bindings

  • counsel-descbinds
    • M-o x execute action.
  • counsel-file-jump
    • M-o d dired.
  • counsel-find-file
    • M-o c copy file.
    • ` bookmarks: efficiently jump between recent directories.
    • $ directories stored in environment variables.
    • C-DEL go up directory. Customize: counsel-up-directory-level.
    • RET open file. Customize: counsel-find-file-extern-extensions.
    • // when on remote, cd to remote root.
    • / C-j select local root.
    • ~ when on remote, cd to remote home.
    • / C-j ~ cd to local home from remote.
  • counsel-git-log
    • M-o v open the current commit in magit.
  • counsel-rg
    • C-x C-d change the current directory for grep.
  • ivy-avy
    • C-v to scroll down.
    • M-v to scroll up.
  • ivy-read C-o
    • m mark and move down.
    • u unmark and move down.
    • DEL move up and unmark.
    • t toggle marks.
    • d perform the action on all marked elements.
  • ivy-switch-buffer
    • C-k kill buffer.
    • M-o x open buffer file externally.
  • ivy-reverse-i-search
    • C-k remove item from the history.

New Commands extensions

These commands are new variants and adaptations of existing commands.

Thing at point variants:

  • swiper-all-thing-at-point.
  • swiper-isearch-thing-at-point.
  • swiper-thing-at-point.

Search variants that go backwards:

  • swiper-backward.
  • counsel-grep-or-swiper-backward.
  • swiper-isearch-backward.

A variant of ivy-switch-buffer with live preview:

  • counsel-switch-buffer.
  • counsel-switch-buffer-other-window.

And finally:

  • counsel-dired - like counsel-find-file, but open dired.
  • swiper-isearch-toggle - toggle between swiper and isearch.

New Commands

I have put these separately so they don't get lost in the crowd. Be sure to try them out.

  • counsel-compile - completion for compile.
  • counsel-register - completion for registers.
  • counsel-minor - completion for minor modes.
  • swiper-isearch - a faster swiper that's not line-based.

Outro

Again, thanks to all the contributors. Happy hacking!

-1:-- Ivy 0.12.0 is out (Post)--L0--C0--July 19, 2019 10:00 PM

(or emacs: Ivy reverse-i-search

Introduction

I'm sure many are aware of the C-r functionality in bash (a whole lot of Emacs bindings are valid in bash and do the same thing). I also like the quirky Emacs-style prompt, that uses a backquote to open and a straight quote to close a your quoted input:

bash-reverse-i-search.png

So when you want to cd to somewhere where you were before you do C-r cd. And then the C-r "roulette" begins: you keep pressing C-r over and over, in the hopes to find what you were looking for.

Getting better history completion with Ivy

Ivy improves the "roulette" situation in two ways:

  • You get an overview of the matching candidates and their count,
  • You can quickly narrow down the candidates with fuzzy matching.

Here's the basic setup to enable C-r completion using ivy:

(define-key minibuffer-local-map
    (kbd "C-r") 'counsel-minibuffer-history)
(define-key shell-mode-map
    (kbd "C-r") 'counsel-shell-history)

The first key binding is also part of counsel-mode, while the second needs to be set up separately, after shell-mode was loaded.

And here's how counsel-shell-history looks like:

counsel-shell-history.png

The cool thing is that ivy-reverse-i-search applies to any Ivy command, not just for shell command completion. I find it especially useful for:

  • counsel-find-file
  • eval-expression

Recent improvement: delete history items

While searching with regexes is great, it's not so great when the old stuff that we won't ever need gets in the way. So now you can delete it with C-k. Since C-k also has to serve as ivy-kill-line, the history deleting behavior will only come into effect when the point is at the end of the line (so that ivy-kill-line does not make sense anyway).

ivy-reverse-i-search-kill.png

This way, your typos don't stay around to haunt you.

Outro

I hope you'll find ivy-reverse-i-search a useful boost to your completion needs. Happy hacking!

-1:-- Ivy reverse-i-search (Post)--L0--C0--July 08, 2019 10:00 PM

Alex Schroeder: Gopher Clients for Emacs

“Always two there are...” – Yoda

I’ve been using the Gopher app on my iOS devices for a while, now, as well as VF-1 on the command line. For Emacs, I was using gopher.el. But yesterday, @gcupc mentioned a second Gopher client for Emacs: Elpher. Indeed, there are at least two implementations of everything in Emacs! 😓

The number one thing I’ve thought when I gave Elpher a try was: there’s no browsing history! There’s no way to say “back”, or “next” (if reading a text file in a menu). When I read the History and Caching entry on GitHub, however, it seems that the up command does what I’d think a back command would do.

And something else I’ve noticed: there’s no TLS support. gophers://alexschroeder.ch:7443 can’t be accessed. I’ve added that to gopher.el, I might as well try and add it to Elpher, I guess?

UTF-8 works. 👍

Tags:

-1:-- Gopher Clients for Emacs (Post)--L0--C0--June 21, 2019 01:16 PM

Jonathan Bennett: Python and Emacs Pt. 1

No Description
-1:-- Python and Emacs Pt. 1 (Post)--L0--C0--June 20, 2019 12:00 AM

Jonathan Bennett: Emacs.org ~ May 2019

No Description
-1:-- Emacs.org ~ May 2019 (Post)--L0--C0--June 18, 2019 12:00 AM

William Denton: Mapping Acid Mothers Temple tours

I just posted Mapping Acid Mothers Temple tours with Org, R and Geonames, which shows how I use Org mode, R and Geonames to go from data printed on the back of a concert t-shirt, like this:

2016 concert shirt 2016 concert shirt

To a map of the band’s recent North American tours:

Map of North American tours Map of North American tours

Some people like RStudio or Jupyter notebooks to work through a problem, but for me it’s Org in Emacs.

-1:-- Mapping Acid Mothers Temple tours (Post William Denton)--L0--C0--June 06, 2019 03:55 PM

Saurabh Kukade: Why Emacs

Why I think Emacs is so cool - “Emacs outshines all other editing software in approximately the same way that the noonday sun does the stars. It is not just bigger and brighter; it simply makes everything else vanish.” -Neal Stephenson, “In the Beginning was the Command Line” Introduction This is my first blog post about Emacs. I...
-1:-- Why Emacs (Post)--L0--C0--June 04, 2019 07:00 AM

Nicolas Martyanoff: Reading RFC documents in Emacs

I am used to read RFC documents in Emacs by opening them as simple text files. But I was always annoyed by the impossibility to follow references to other documents. Additionally, searching for the right RFC usually involve going back to the web browser and Google while I would prefer staying in Emacs.

Since I enjoy Helm so much, I thought it would be easy to parse the index of all RFC documents, and create a browser allowing me to pick a RFC. And since I was writing some elisp, I ended up writing a major mode to highlight parts of RFC documents, including adding links when another RFC is mentioned.

So here is rfc-mode. It is quite simple for the time being, and I have a few ideas for more features, but it is already quite useful.

It is also available on MELPA.

RFC browser

RFC reader

-1:-- Reading RFC documents in Emacs (Post)--L0--C0--June 02, 2019 12:00 AM

Chris Wellons: UTF-8 String Indexing Strategies

This article was discussed on Hacker News.

When designing or, in some cases, implementing a programming language with built-in support for Unicode strings, an important decision must be made about how to represent or encode those strings in memory. Not all representations are equal, and there are trade-offs between different choices.

One issue to consider is that strings typically feature random access indexing of code points with a time complexity resembling constant time (O(1)). However, not all string representations actually support this well. Strings using variable length encoding, such as UTF-8 or UTF-16, have O(n) time complexity indexing, ignoring special cases (discussed below). The most obvious choice to achieve O(1) time complexity — an array of 32-bit values, as in UCS-4 — makes very inefficient use of memory, especially with typical strings.

Despite this, UTF-8 is still chosen in a number of programming languages, or at least in their implementations. In this article I’ll discuss three examples — Emacs Lisp, Julia, and Go — and how each takes a slightly different approach.

Emacs Lisp

Emacs Lisp has two different types of strings that generally can be used interchangeably: unibyte and multibyte. In fact, the difference between them is so subtle that I bet that most people writing Emacs Lisp don’t even realize there are two kinds of strings.

Emacs Lisp uses UTF-8 internally to encode all “multibyte” strings and buffers. To fully support arbitrary sequences of bytes in the files being edited, Emacs uses its own extension of Unicode to precisely and unambiguously represent raw bytes intermixed with text. Any arbitrary sequence of bytes can be decoded into Emacs’ internal representation, then losslessly re-encoded back into the exact same sequence of bytes.

Unibyte strings and buffers are really just byte-strings. In practice, they’re essentially ISO/IEC 8859-1, a.k.a. Latin-1. It’s a Unicode string where all code points are below 256. Emacs prefers the smallest and simplest string representation when possible, similar to CPython 3.3+.

(multibyte-string-p "hello")
;; => nil

(multibyte-string-p "π ≈ 3.14")
;; => t

Emacs Lisp strings are mutable, and therein lies the kicker: As soon as you insert a code point above 255, Emacs quietly converts the string to multibyte.

(defvar fish "fish")

(multibyte-string-p fish)
;; => nil

(setf (aref fish 2) ?ŝ
      (aref fish 3) ?o)

fish
;; => "fiŝo"

(multibyte-string-p fish)
;; => t

Constant time indexing into unibyte strings is straightforward, and Emacs does the obvious thing when indexing into unibyte strings. It helps that most strings in Emacs are probably unibyte, even when the user isn’t working in English.

Most buffers are multibyte, even if those buffers are generally just ASCII. Since Emacs uses gap buffers it generally doesn’t matter: Nearly all accesses are tightly clustered around the point, so O(n) indexing doesn’t often matter.

That leaves multibyte strings. Consider these idioms for iterating across a string in Emacs Lisp:

(dotimes (i (length string))
  (let ((c (aref string i)))
    ...))

(cl-loop for c being the elements of string
         ...)

The latter expands into essentially the same as the former: An incrementing index that uses aref to index to that code point. So is iterating over a multibyte string — a common operation — an O(n^2) operation?

The good news is that, at least in this case, no! It’s essentially just as efficient as iterating over a unibyte string. Before going over why, consider this little puzzle. Here’s a little string comparison function that compares two strings a code point at a time, returning their first difference:

(defun compare (string-a string-b)
  (cl-loop for a being the elements of string-a
           for b being the elements of string-b
           unless (eql a b)
           return (cons a b)))

Let’s examine benchmarks with some long strings (100,000 code points):

(benchmark-run
    (let ((a (make-string 100000 0))
          (b (make-string 100000 0)))
      (compare a b)))
;; => (0.012568031 0 0.0)

With using two, zeroed unibyte strings it takes 13ms. How about changing the last code point in one of them to 256, converting it to a multibyte string:

(benchmark-run
    (let ((a (make-string 100000 0))
          (b (make-string 100000 0)))
      (setf (aref a (1- (length a))) 256)
      (compare a b)))
;; => (0.012680513 0 0.0)

Same running time, so that multibyte string cost nothing more to iterate across. Let’s try making them both multibyte:

(benchmark-run
    (let ((a (make-string 100000 0))
          (b (make-string 100000 0)))
      (setf (aref a (1- (length a))) 256
            (aref b (1- (length b))) 256)
      (compare a b)))
;; => (2.327959762 0 0.0)

That took 2.3 seconds: about 2000x longer to run! Iterating over two multibyte strings concurrently seems to have broken an optimization. Can you reason about what’s happened?

To avoid the O(n) cost on this common indexing operating, Emacs keeps a “bookmark” for the last indexing location into a multibyte string. If the next access is nearby, it can starting looking from this bookmark, forwards or backwards. Like a gap buffer, this gives a big advantage to clustered accesses, including iteration.

However, this string bookmark is global, one per Emacs instance, not once per string. In the last benchmark, the two multibyte strings are constantly fighting over a single string bookmark, and indexing in comparison function is reduced to O(n^2) time complexity.

So, Emacs pretends it has constant time access into its UTF-8 text data, but it’s only faking it with some simple optimizations. This usually works out just fine.

Julia

Another approach is to not pretend at all, and to make this limitation of UTF-8 explicit in the interface. Julia took this approach, and it was one of my complaints about the language. I don’t think this is necessarily a bad choice, but I do still think it’s inappropriate considering Julia’s target audience (i.e. Matlab users).

Julia strings are explicitly byte strings containing valid UTF-8 data. All indexing occurs on bytes, which is trivially constant time, and always decodes the multibyte code point starting at that byte. But it is an error to index to a byte that doesn’t begin a code point. That error is also trivially checked in constant time.

s = "π"

s[1]
# => 'π'

s[2]
# ERROR: UnicodeError: invalid character index
#  in getindex at ./strings/basic.jl:37

Slices are still over bytes, but they “round up” to the end of the current code point:

s[1:1]
# => "π"

Iterating over a string requires helper functions which keep an internal “bookmark” so that each access is constant time:

for i in eachindex(string)
    c = string[i]
    # ...
end

So Julia doesn’t pretend, it makes the problem explicit.

Go

Go is very similar to Julia, but takes an even more explicit view of strings. All strings are byte strings and there are no restrictions on their contents. Conventionally strings contain UTF-8 encoded text, but this is not strictly required. There’s a unicode/utf8 package for working with strings containing UTF-8 data.

Beyond convention, the range clause also assumes the string contains UTF-8 data, and it’s not an error if it does not. Bytes not containing valid UTF-8 data appear as a REPLACEMENT CHARACTER (U+FFFD).

func main() {
    s := \xff"
    for _, r := range s {
        fmt.Printf("U+%04x\n", r)
    }
}

// U+03c0
// U+fffd

A further case of the language favoring UTF-8 is that casting a string to []rune decodes strings into code points, like UCS-4, again using REPLACEMENT CHARACTER:

func main() {
    s := \xff"
    r := []rune(s)
    fmt.Printf("U+%04x\n", r[0])
    fmt.Printf("U+%04x\n", r[1])
}

// U+03c0
// U+fffd

So, like Julia, there’s no pretending, and the programmer explicitly must consider the problem.

Preferences

All-in-all I probably prefer how Julia and Go are explicit with UTF-8’s limitations, rather than Emacs Lisp’s attempt to cover it up with an internal optimization. Since the abstraction is leaky, it may as well be made explicit.

-1:-- UTF-8 String Indexing Strategies (Post)--L0--C0--May 29, 2019 09:52 PM

Emacs Redux: Spell Checking Comments

I’m notorious for all the typos I make.1 Thankfully Emacs features an awesome built-in mode named flyspell to help poor typists like me. Flyspell highlights misspelled words as you type (a.k.a. on the fly) and has useful keybindings to quickly fix them.

Most people typically enable flyspell only for major modes derived from text-mode (e.g. markdown-mode, adoc-mode), but it can really help programmers as well by pointing out typos they make in comments. All you need to do is enable flyspell-prog-mode. I typically enable it for all programming modes2 like this:

(add-hook 'prog-mode-hook #'flyspell-prog-mode)

Now you’ll get instant feedback when you make some typo in a comment. To fix a word just press C-c $ (M-x flyspell-correct-word-before-point), while your cursor is behind it.3

flyspell_prog_mode.gif

That’s all I have for you today! Keep fixing those nasty typos!

  1. Especially in blog posts. 

  2. At least the well-behaved ones that derive from prog-mode that is. 

  3. There are many other ways to correct misspelled words with flyspell, but we’ll ignore them for now for the sake of simplicity. 

-1:-- Spell Checking Comments (Post Bozhidar Batsov)--L0--C0--May 24, 2019 07:07 AM