blog
Massive changes in the last two days. The original pas repo has become an organization, pas-audio-server. We've begun splitting the original repo into parts, as afforded by the shift to an organization. I made an effort to set up a pas-audio-server github.io web site. Everyone claims it's easy. They must have access to a memo I didn't get.
Among the changes in the last few days:
- DACs are now found automatically, with a best effort made to figure out a nice human-readable name. This made it easy to test the server locally on the Linux VM I'm typing in right now. The audio device is the MacBook's native audio.
- The curses client now has a help window (accessible via ^H).
- The curses client is now wide-character aware. Wide characters now render as blanks. Woot!
- Found and fixed a lovely concurrency bug. It took a while to feel confident it existed, but once it was repeatable it took only minutes to find. Changing the order of two statements provided the fix (a sem_post and an unlock). I love concurrent programming (seriously, I do).
- Command line options on both clients permit targeting servers on different machines - this should have worked from the start, of course. It was nice to verify it worked.
- A minimal server emulator allows a developer of clients (including web servers interposing themselves between web clients and the pas-core) to test without having to set up a running pas-core.
- Found and fixed a communications incompatibility between 32- and 64-bit machines. I forgot that size_t changes size on 64-bit machines (a sketch is below).
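A minimal sketch of the approach, assuming a hypothetical framing helper (the actual pas code may differ): the length prefix is always a fixed-width uint32_t in network byte order, never a size_t.

```
#include <arpa/inet.h>   // htonl
#include <cstdint>
#include <sys/socket.h>

// Hypothetical framing helper. The length prefix is a fixed-width uint32_t
// in network byte order - size_t differs in width between 32- and 64-bit
// machines, so it must never go on the wire.
bool SendFramed(int sock, const void * payload, uint32_t length)
{
    uint32_t wire_length = htonl(length);
    if (send(sock, &wire_length, sizeof(wire_length), 0) != (ssize_t) sizeof(wire_length))
        return false;
    return send(sock, payload, length, 0) == (ssize_t) length;
}
```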
All-in-all, among the most fun I've had was tracking down and fixing the cool bugs, such as the concurrency bug. Give me MORE bugs! 👍
Lozord is making good progress on a web server to interpose itself between web-based clients and the pas server (pas-core). He's writing it in Go, which is very cool in and of itself! GO LOZORD!
Things we're thinking about for someday:
- Audio effects as audio packets are written to pulseaudio. An obvious one is volume, so it can be controlled remotely (see the sketch after this list).
- A refactoring of audio sourcing to permit Internet Radio to work alongside ffmpeg.
- A refactoring of audio sinking to permit output of RTSP-based streams.
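For the volume idea in the first bullet, here is a minimal sketch (function and names are hypothetical, not pas code) of applying a gain to packed signed 24-bit little-endian stereo frames just before they are written to pulseaudio:

```
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical gain stage for packed signed 24-bit little-endian samples
// (6 bytes per stereo frame). A gain of 1.0 passes audio through unchanged.
void ApplyGain(uint8_t * buffer, size_t bytes, double gain)
{
    for (size_t i = 0; i + 3 <= bytes; i += 3)
    {
        // Assemble and sign-extend one 24-bit sample.
        int32_t s = buffer[i] | (buffer[i + 1] << 8) | (buffer[i + 2] << 16);
        if (s & 0x800000)
            s |= ~0xFFFFFF;

        // Scale, then clamp to the 24-bit range.
        double scaled = std::max(-8388608.0, std::min(8388607.0, s * gain));
        int32_t out = (int32_t) scaled;

        buffer[i]     = out & 0xFF;
        buffer[i + 1] = (out >> 8) & 0xFF;
        buffer[i + 2] = (out >> 16) & 0xFF;
    }
}
```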
Implementing the last two along with what pas can already do would position pas as a contender for the most powerful audio server anywhere.
Lozord is making progress.
I've written a stub emulator of pas for use by web server and client writers.
BUT - we are shifting over to a github organization structure. The organization is pas-audio-server.
This repository will be phased out.
Documentation
See the pas client API.
Cleaned up the commands.proto.
Number 4 below was fun to work on. I realized that a first pass through the file system would catalog file sizes. Only those with the same file size would be checksummed, as checksumming takes a long time.
Further, deduplication isn't really the job of a media server - it would be nice but can be deferred to "some day".
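Even deferred, the idea is simple enough to sketch (the helpers here are hypothetical): group files by size first and only checksum groups with more than one member, since checksumming is the expensive part.

```
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helpers - FileSize() from stat(), Checksum() from whatever
// hash is handy (md5, crc32, ...).
uint64_t FileSize(const std::string & path);
std::string Checksum(const std::string & path);

// Group paths into likely-duplicate sets. Files with a unique size cannot
// be duplicates, so they are never checksummed.
std::map<std::string, std::vector<std::string>>
FindLikelyDuplicates(const std::vector<std::string> & paths)
{
    std::map<uint64_t, std::vector<std::string>> by_size;
    for (const std::string & p : paths)
        by_size[FileSize(p)].push_back(p);

    std::map<std::string, std::vector<std::string>> by_checksum;
    for (const auto & bucket : by_size)
    {
        if (bucket.second.size() < 2)
            continue;
        for (const std::string & p : bucket.second)
            by_checksum[Checksum(p)].push_back(p);
    }
    return by_checksum;
}
```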
On to documentation.
What to add next? Choices:
- The existing clients do not know about namespaces. Related: there are no new namespace-aware proto3s.
- There isn't a document describing the client API. I bet Lozord would appreciate this. Maybe this is the right next thing to do.
- File system loop checking could be made more robust. To do correctly, this is a fair bit of work.
- Checking to see if there are duplicates between namespaces. It would be reasonable to compare checksums. This would be fun to do in a test harness to see how many duplicates I already have. The documentation is the next right thing to do, but this one is more fun. I'll get to the documentation...
New file system handling is done and it's killer.
Actually, it's 1979 Unix, but that was killer. So, by the law of transitivity, this new code is killer too.
Major reorganization of exception handling to make the product more "production ready." Time will tell. Or really, the uncaught exceptions will tell.
The code to populate the database will be kept as a separate program. The new code is done (with two exceptions).
- Namespaces must be added to clients / commands.
- There is an uncaught exception waiting in fsmain.cpp. Setting the schema is unguarded.
One more change - implemented htonl and ntohl on binary lengths.
Spent WAAAAYYY too much time today redoing how the file system is dealt with. I spent hours chasing my tail trying to use atomic increments from C++11. I spent way too long attributing their inconsistent results to pointer problems elsewhere in my code.
The first (incomplete) pass is in the can. Paths and tracks are now separated (as they should be). All paths are stored in the way that Unix directories are stored - every component directory has an id (me) and the id of its parent (up). The root has an up of -1.
All tracks possess only their own file name and the id of the directory they live in. This is as it should be. Full paths are easy to derive every once in a while (when a track is launched on the player). No biggie.
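A minimal sketch of deriving a full path from that scheme (the struct and lookup are hypothetical stand-ins for the real database access):

```
#include <string>

// Hypothetical row for the dirs table: a directory knows its own id ("me"),
// its parent's id ("up"), and its name. The root has an up of -1.
struct DirRow
{
    int id;
    int up;
    std::string name;
};

// Hypothetical accessor backed by the database (or a cache of it).
DirRow GetDir(int id);

// Walk the "up" pointers to the root, prepending each component.
std::string FullPath(int dir_id, const std::string & file_name)
{
    std::string path = file_name;
    while (dir_id != -1)
    {
        DirRow d = GetDir(dir_id);
        path = d.name + "/" + path;
        dir_id = d.up;
    }
    return path;
}
```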
What needs to happen next is that a notion of "namespaces" must be added so that there can be multiple root directories all living in the same database.
Also, the code checked in today does not call ffprobe, so details about each track are not gathered yet.
All of the database connectivity has been made multithreaded. MySQL is easy in this regard. Before an OMP pragma, I set the number of threads to a known number, then open that many connections in an array. Inside the threaded for loop, the thread ID is used as the index into the connections array. MySQL is happy. I'm happy.
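A minimal sketch of that pattern (the connection and insert helpers are hypothetical stand-ins, since the point here is the threading, not the connector API):

```
#include <omp.h>
#include <string>
#include <vector>

struct Track { std::string path; };          // hypothetical

// Hypothetical wrappers around one MySQL connection each.
struct DBConnection;
DBConnection * OpenConnection();
void CloseConnection(DBConnection *);
void InsertTrack(DBConnection *, const Track &);

void InsertAll(const std::vector<Track> & tracks)
{
    const int nthreads = 8;
    omp_set_num_threads(nthreads);

    // One connection per thread, opened before the parallel region.
    std::vector<DBConnection *> conns(nthreads);
    for (int i = 0; i < nthreads; i++)
        conns[i] = OpenConnection();

    // Each thread indexes the connection array with its own thread ID,
    // so no connection is ever shared between threads.
    #pragma omp parallel for
    for (int i = 0; i < (int) tracks.size(); i++)
        InsertTrack(conns[omp_get_thread_num()], tracks[i]);

    for (int i = 0; i < nthreads; i++)
        CloseConnection(conns[i]);
}
```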
Working way too hard on this project but I just cannot stop. I need to stop, but cannot. I'm not in pain any longer (anywhere near what I was). The pain just sapped me of the will to do my normal work. I have to get back into the swing of all things. And the world is getting scarier day by day due to the fascist Cheeto in the White House. Coding non-stop seems to be an escape from the sheer awfulness of the state our nation is in - and the world.
Attempted to get pas running on an x86 Ubuntu guest VM. The software worked great but Linux audio did not. Spent hours on this only to find Linux was the problem. The good news, though, is that pas is now tolerant of individual DAC failures. For example, suppose DAC 1 out of 0, 1, and 2 fails. DACs 0 and 2 will not be impacted, and attempts to access DAC 1 will be properly ignored.
Instructions were added to the curses GUI. ESC quits in an orderly fashion. The GUI is easier to configure for the future as all size measurements have been made symbolic. Verified there's nothing wrong with the short titles - they really are short that way in the DB.
Tested multiple hours of queued music on three DACs at once.
Added support for clearing play queues to both the connection manager and the curses client. The code to do it was already in audio_component. Added support for appending one DAC's queue onto the end of another, but there is no way to test this in the curses client. The command line client will have to be modded.
A big one was that the main player loop has been refactored for clarity as well as to fix a couple of bugs.
- The timecode didn't reset when the NEXT queued song was invoked.
- What it means to be awakened by play or next was decided and implemented.
Lozord and I gave thought to how an adaptable logging scheme can be implemented. I found the Apache logger online but holy cow, that's too complicated. I figure I'll just add logging levels to the existing logging class.
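A minimal sketch of what I mean by levels (the class here is a hypothetical stand-in for the existing logging class, not its real interface):

```
#include <iostream>
#include <string>

enum LogLevel { LOG_DEBUG, LOG_INFO, LOG_WARNING, LOG_ERROR, LOG_FATAL };

class Logger
{
public:
    void SetThreshold(LogLevel level) { threshold = level; }

    void Log(LogLevel level, const std::string & message)
    {
        if (level < threshold)
            return;                     // below the configured level - dropped
        // Stand-in sink; the real logger writes to a file, not the console.
        std::cerr << message << std::endl;
    }

private:
    LogLevel threshold = LOG_INFO;
};
```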
A potentially harmful bug during tear-down was found and fixed. It was causing pulseaudio to get hung on getting one of its own mutexes. It was something I was doing in the audio_component destructor.
I am very pleased.
Nearly done. Notice there are three DACs running concurrently now.
Started a curses based pas client. Because? 'Murica.
Integration of protocol buffers is complete. Slowed down in part by a lack of accessible documentation from Google. Much time was spent on a bug which turned out to be data-length dependent over the wire. A recv() flag of MSG_WAITALL cured what ailed me.
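A minimal sketch of the cure, assuming the same hypothetical length-prefix framing as in the earlier sketch: without MSG_WAITALL, recv() may return a partial read, and the decode breaks only on messages long enough to span multiple TCP segments - exactly a data-length-dependent bug.

```
#include <arpa/inet.h>   // ntohl
#include <cstdint>
#include <string>
#include <sys/socket.h>

// Read one length-prefixed protobuf payload. MSG_WAITALL makes recv()
// block until the full count has arrived rather than returning early.
bool ReadFramed(int sock, std::string & payload)
{
    uint32_t wire_length = 0;
    if (recv(sock, &wire_length, sizeof(wire_length), MSG_WAITALL) != (ssize_t) sizeof(wire_length))
        return false;

    uint32_t length = ntohl(wire_length);
    payload.resize(length);
    if (length == 0)
        return true;
    return recv(sock, &payload[0], length, MSG_WAITALL) == (ssize_t) length;
}
```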
I have ordered a third DAC (from Fiio). Note that the Fiio X5 crashed Linux. Let's hope for better results. Also, don't tell my wife that I ordered it.
I can now build, transmit, receive and decode the messages described below.
The last few days have been spent learning how to use protocol buffers (from Google) and converting the pas server and command line client to use them. All is going well and is almost finished. It is nice to know that the pas server itself is nearing completion for a beta 1, as the functionality is mostly there, and with the conversion to proto buffers its communication to a front-end server is almost there too.
Proto buffers are indeed making things easier than using XML or json.
For example:
    // This is a return type
    message Row {
        Type type = 1;
        map <string, string> results = 2;
    }

    // This is a return type
    message SelectResult
    {
        Type type = 1;
        repeated Row row = 2;
    }
This specifies the message structure for returning rows from a database. SelectResult contains an array of Row. Row contains a map of column names to column contents. Accessing these data is just about identical to accessing the same native C++ structures.
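For instance, the generated C++ for these messages can be walked like ordinary containers. The surrounding function and header name are illustrative (and any proto package prefix is omitted), but row_size(), row(i) and the map accessor are what protoc generates for the proto above:

```
#include <iostream>
#include "commands.pb.h"   // generated by protoc from the .proto above

// Print every row of a SelectResult. The repeated field gets row_size()
// and row(i); the map field behaves like a map of column name -> value.
void PrintResult(const SelectResult & result)
{
    for (int i = 0; i < result.row_size(); i++)
    {
        const Row & r = result.row(i);
        for (const auto & column : r.results())
            std::cout << column.first << " = " << column.second << "\n";
        std::cout << "\n";
    }
}
```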
Protobuffers are understood now and working in a test harness. Quite cool. Way easier than json or xml.
Good one. If ffmpeg up-samples (say from 22500 Hz on a crap mp3 to 44100 Hz), it introduces a 0.8 second pssst. So, I skipped the first three buffers. No more pssst. Great, right?
A few days later I play a flac and ... the entire track turns to pssst. Uggh.
Now the player thread checks the file extension - if mp3, it skips. Something to watch out for, though: there may be other low-resolution file formats besides mp3.
Protobuffers are almost implemented in a branch. These will replace the ad hoc parsing / exchanging of commands between the UI and pas. LOZORD is designing the protos. I'm implementing the prototype in the command line client and pas server.
Looking good.
I was going to start on rewriting the file system handling but...
I decided to revamp the logging capability to make it more useful in a production environment (no more output to the console). Since logging was so liberally emplaced, this was a very long process. I introduced two bugs which took a long time to find - both were stupid one-character typos. Rookie mistakes; we all make them. But I used the opportunity to expand the debugging power of the logger and its use.
We'll need something like cd, pwd and ls. Makes sense to model the database after the Unix file system itself.
    table dirs
        + dir_id
        + parent_id
        + name

    table tracks
        + id
        + dir_id
        + stuff
I knew this was coming.
LOZORD suggests using Google's Proto Buffers rather than json to transfer data and commands. Sounds good to me.
Next and Clear are done.
Here are the defined commands now:
    ac        artist count
    tc        track count
    se c p    select on column c with pattern p
    sq        server quit
    n P       **QUEUE** on DAC n
    n Z       pause DAC n
    n S       stop DAC n
    n R       resume DAC n
    n who     artist on DAC n
    n what    title on DAC n
    n ti      timecode on DAC n
    n next    skip to next queued track on DAC n
    n clear   clear the queue for DAC n
OK - refactoring of how tracks are played is complete. Drawing off a play queue is done. Waking when queue is empty is done. Still to come:
- Next
- Clear
Let's hash through some ideas.
I want to:
- be able to queue up many tracks at once "n queue id"
- advance to next "n P" - if already playing
- clear the queue "n clear"
- I can already pause "n Z"
- I can already resume "n R"
There is a command queue (semaphore and lock).
Change implementation of "n P" so that it doesn't look at its argument but instead draws from a play queue.
Protect the play queue with lock only.
If, when adding to play queue, you find it empty, issue "n P".
Seems like it will work.
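A minimal sketch of that idea (class and member names hypothetical): the play queue needs only a lock, and Add() tells the caller whether the queue was empty so the caller can issue the equivalent of "n P".

```
#include <mutex>
#include <queue>
#include <string>

// Hypothetical per-DAC play queue. The command queue (semaphore and lock)
// is separate; this one needs only the lock because the player thread is
// woken through the command queue, not through this queue.
class PlayQueue
{
public:
    // Returns true if the queue was empty, i.e. the caller should now
    // issue the equivalent of "n P" to kick the idle DAC.
    bool Add(const std::string & track_id)
    {
        std::lock_guard<std::mutex> guard(m);
        bool was_empty = q.empty();
        q.push(track_id);
        return was_empty;
    }

    // Called by the player thread when it wants the next track.
    bool Next(std::string & track_id)
    {
        std::lock_guard<std::mutex> guard(m);
        if (q.empty())
            return false;
        track_id = q.front();
        q.pop();
        return true;
    }

private:
    std::mutex m;
    std::queue<std::string> q;
};
```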
BTW, the blocking / non-blocking point below came up because I am using pas for all my tunes today. Feeling proud.
Felt my way through json parsing. There is little on how memory is reclaimed other than json_object_put(). Do I put() every json_object, even those that come as part of larger ones? Or do I do just one put() for the larger object? Sorry folks, unless you are very disciplined - doxygen is not enough.
I understand json-c is reference counted so I'm going to shut my eyes for now, and hope it works out. I am not happy about knowingly allowing a potential memory leak to exist.
TODO I can answer this question for myself, which I will have to put off, by writing a test harness and watching its memory footprint change.
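A minimal sketch of that harness, assuming json-c's json_tokener_parse() / json_object_put(): parse the same document in a tight loop, put() only the root, and watch the process footprint.

```
#include <json-c/json.h>

// If reference counting reclaims children when the root is put(), the
// footprint of this loop stays flat; if not, it grows visibly.
int main()
{
    const char * doc = "{\"artist\":\"Pink Floyd\",\"tracks\":[\"Dogs\",\"Sheep\"]}";

    for (long i = 0; i < 10 * 1000 * 1000; i++)
    {
        json_object * root = json_tokener_parse(doc);
        json_object * artist = nullptr;
        json_object_object_get_ex(root, "artist", &artist);   // borrowed reference
        (void) artist;
        json_object_put(root);                                 // put() only the root
    }
    return 0;
}
```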
Now that player thread commands are all asynchronous, I realize I have to have something that allows me to queue up blocking requests. That is:
I have handled stopping a current track and playing a new track if the thread receives another play command. But what if I want to play a playlist autonomously? At the very least I need a callback mechanism where the player thread can notify a listener that the current track is done.
Without the ability to have blocking play requests, someone is going to have to poll. That would be a *bad* idea.
Feeling my way through json parsing.
Implemented who and what. Regularized the command format.
Here are the commands implemented so far.
    ac        artist count
    tc        track count
    se c p    select on column c with pattern p
    sq        server quit
    n P       play on DAC n
    n Z       pause DAC n
    n S       stop DAC n
    n R       resume DAC n
    n who     artist on DAC n
    n what    title on DAC n
    n ti      timecode on DAC n
If you profess your grooviness by contributing to open source, you've contributed nothing at all if your work is ill-documented, shoddy or arcane. All you've done is masturbated with a keyboard for your own vanity and feelz.
Pretty exhausted. I decided to serialize data from server to client in json. I know nothing about json so had to learn it. Search and serialization is almost complete. On the other end I have no idea how to parse. I'll have to learn that too.
Taking a walk to exercise my back. See if I drop to the cold hard ground in pain.
Revisited the clean termination issue - needed to add a setsockopt call to work around TIME_WAIT. I should have thought of this sooner. The error message implied network, not audio. I was guilty of not interpreting all the information at my disposal.
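For the record, a minimal sketch of the work-around. The entry above only says "setsockopt"; SO_REUSEADDR on the listening socket before bind() is the usual choice for the TIME_WAIT problem, so that is what is shown here:

```
#include <arpa/inet.h>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>

// Allow the listener to bind even while connections from a previous run
// are still sitting in TIME_WAIT.
bool MakeListener(int & sock, uint16_t port)
{
    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return false;

    int yes = 1;
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    return bind(sock, (sockaddr *) &addr, sizeof(addr)) == 0 && listen(sock, 8) == 0;
}
```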
You can specify an output sample rate, and if the track differs, ffmpeg will resample. BUT... it introduces noise when starting up. Tsst. There is an option to seek in ffmpeg (-ss). It did not solve my problem. I solved this by skipping the first three buffers' worth of samples. I am using 24KB again, so this means I skip the first 0.84 seconds of every track. This is not great. Maybe I can get rid of this for non-resampled tracks by logging sample rate in the database.
The offending mp3s have a sample rate of 22500!!!
Now I'll get back to mysql / client features.
Took a detour to get the player thread to cleanly shut down pulseaudio.
And, finally found a track recorded at a different sample rate meaning that I have to deal with that issue now. I figured it was coming and was surprised I hadn't been bitten by it yet.
Resume and pause now work. Implementation was a perfect case for a goto.
As implemented, once entering the paused state, any player thread command will awaken the thread. The newest command will be drained from the buffer and executed. So, when paused, even a "play something else" command will do the right thing.
The new command line client is so much easier to use and code.
Next up: reimplementing things like a general mysql query in the new client.
I am confident in the non-blocking command interface (commands such as play, pause, etc.).
I've retired the first version of the command line client as it was written sloppily. This version has a bit more thought in it. It's working great.
All of this has largely been written in bed, as I am suffering some excruciating back pain. Note to you youngin's - never stop being physically active. Forty-plus years of intense coding without breaks has broken me. Don't let it happen to you. That 30-hour marathon? It will cost you later in life.
I've added code to check for a pending command inside the main loop of the player thread. This code path exercises the NON-blocking access of the command queue. It seems to work, as the 'Q' (terminate the thread) and 's' (stop the current track) commands work when sent from the client.
Trying a doubling of the current buffer size to (1 << 13) * 6. Recall two samples (stereo) of 24 bit audio is 6 bytes. Pulse will fail if it is asked to write something that isn't an even multiple of 6 (in the 24 bit per sample mode).
The current Player Thread architecture combines audio decoding and playing. This has consequences that are not nice.
- It means that I cannot easily add Internet Radio, which apparently is a thing.
- It means I cannot rewind or fast forward. This may not be an issue as you cannot do this with Internet Radio (easily) either. Apparently, Internet Radio is a thing.
Added timecode during breakfast. This took longer than expected as I saw various opportunities to refactor copy / pasted code.
Note to youngsters: write in one place, test in one place, fix in one place.
Why is flushing cin so hard?
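For anyone wondering, this is the incantation I mean (the standard library idiom, nothing pas-specific):

```
#include <iostream>
#include <limits>

// Discard everything up to and including the next newline - the usual way
// to "flush" whatever is left in std::cin after a failed or partial read.
void FlushCin()
{
    std::cin.clear();
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
```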
Welcome aboard LOZORD as a collaborator. LOZORD knows all those new fangled frameworks and dreamworks and homeworks and things that I am too old to figure out.
This merits its own entry. I can now use the client to remotely launch multiple songs simultaneously in pas. DAC 1 is playing, right now, Frankie Goes To Hollywood. DAC 0 is playing, right now, Pink Floyd. Both DACs are driven from files on the NAS (one mp3, the other flac). No runs, drips or errors.
I'm thinking I like what appears to be an unintended feature. The thread that will ride over the audio hardware is being passed information about all of the DACs in the system. I was just pondering how I'm going to assign some ownership of one DAC or another to some client somewhere in the world wide web. It occurs to me, as I eat my breakfast, that this is wrong. DACs want to be free! I'm thinking maybe the choice of DAC will be made exactly at the point of launching a track. At this moment, any client ought to be able to play on any DAC. If some sort of policy needs to be imposed, I can worry about that at a later date and at a higher level.
Update: This morning's coding lays formal groundwork for clients to cause changes in the thread managing the actual hardware. A client can cause a command to be added, in a thread-safe way, to the player thread's command queue. Such commands are p, r, s and z for play, resume, stop and pause (catch some z's). Sync is accomplished via a semaphore and lock in a one-sided producer/consumer with infinite resources.
It has been a LOT of work, relatively speaking, to integrate the player. But then, this is where the potatoes go on the fork, isn't it? Thank you VC Ned Hazen for such a lovely metaphor.
Update: This afternoon has been productive in many ways. With regard to pas, I can now send player-hardware-specific commands such as play, pause, etc. from a remote client to any of the DACs. The multi-threaded command passing seems to be working very well. Remember, commands passed from the connection manager to the player thread must be controlled carefully. Review: there is a std::queue guarded by a std::mutex. Then, to wake an idle player thread, a POSIX semaphore is used. If the player was NOT idle, the sem_wait will return immediately, because a sem_post is done while the queue is locked and a command is added. Simple.
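A minimal sketch of that hand-off (names are hypothetical, the mechanism is as described above): the connection thread locks, pushes and posts; the player thread waits on the semaphore, then locks to pop.

```
#include <mutex>
#include <queue>
#include <semaphore.h>

struct PlayerCommand { char op; };   // 'p', 'r', 's' or 'z'

std::queue<PlayerCommand> commands;
std::mutex commands_lock;
sem_t commands_sem;                  // sem_init(&commands_sem, 0, 0) at startup

// Producer side - called by a connection manager thread.
void SendCommand(const PlayerCommand & c)
{
    commands_lock.lock();
    commands.push(c);
    sem_post(&commands_sem);         // posted while the queue is locked
    commands_lock.unlock();
}

// Consumer side - the player thread. If it was busy (not idle), sem_wait()
// returns immediately because the post has already been counted.
PlayerCommand ReceiveCommand()
{
    sem_wait(&commands_sem);
    std::lock_guard<std::mutex> guard(commands_lock);
    PlayerCommand c = commands.front();
    commands.pop();
    return c;
}
```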
But now I am typing vi commands everywhere. Even herex:wq
Next - tonight maybe - the audio hardware player and then a GUI could be started!
Update 👍 Unusual update before the main article... I just received a second DAC and have just tested multiple concurrent hiccup-free streams coming from a little bitty ARM dev board. Both files currently playing are flacs.
Actual first entry for this day: Looked at some high end audio servers in the multi-thousand dollar range. They don't support multiple streams. Why are they worth the money - just for the DACs? Spend a few hundred on DACs for pas, spend one tenth the money, and get multiple concurrent streams. WTF?
Rewrote the media discovery code this morning. Now it builds a vector of all folders, then invokes OpenMP over them. Each thread adds all the permitted files in its folder to the database.
There is something strange with the last few tracks - I noticed this with the previous version as well. The last few get amazingly slow. Eight threads are present in pas so OpenMP hasn't collected the zombies. There is only one ffprobe running and it's super slow. The time between ffprobes is amazingly long. I have reviewed the DB code and am closing connections properly per folder. Threading is up to OpenMP. The only ideas I have are:
- Somehow there is some processor affinity taking place, with the little processors lagging behind (but why would only one run?).
- The NAS is slowing way down. Why would only one run?
- I'm running out of process table slots and am waiting for available ones. Don't think so.
- I am running out of DB connections, but I'm freeing them correctly.
- Here's an idea I just thought of.... The time it takes to take down a TCP connection could be several seconds. This is in the database connector though. Why wouldn't they take care of this? And why would only one run?
- There's something wrong with the files in question. Unlikely and why would only one run?
I am going to put this aside for now as rebuilding the database isn't something that's a high priority at the moment. I want the audio code integrated so I can play tracks!
Here's a snippet of output from the terminal-based client. I am searching for tracks whose title begins with "End". Currently there is a SQL injection vulnerability in the Pattern:
    Command: se
    Column: title
    Pattern (no spaces): End%
    4108   Doors            End of the Night               The Doors [1967]
    9188   The Doors        End of the Night               The Doors                    Pop/Rock
    6190   Vangelis         End Titles from "Bladerunner"  Themes
    459    André Previn     End Titles* (Instrumental)     My Fair Lady Soundtrack      Soundtracks
    10463  André Previn     End Titles* (Instrumental)     My Fair Lady Soundtrack      Soundtracks
    8982   The Beach Boys   Endless Harmony (from 'Keepin  Endless Harmony Soundtrack   Other
    Command:
Update: Stubs for the audio component have been added.
The network manager is already multi-threaded, accepting connections and farming each out to its own thread.
This thread is the logical place to spin off the audio playing thread. The network connection thread will loop over messages coming from an external client. It will service those that it can, directly. It launches the audio managing thread and exchanges data with it via shared memory guarded by mutexes. That's my current thinking, anyway.
Update: Much work accomplished - zeroing in on the immediately preceding idea. Why the hell does anyone watch those stupid housewives shows? Wife loves them. I love her. Transitivity does not hold in the algebra of our marriage in this matter.
Note to Lozord: I thought this was going to be my summer project. It's almost ready for a GUI!
Today the server spoke to the simple text-based client in a two-way conversation for the first time. The client asked for the current track count and the server replied with the correct number. More to come later today, I expect. BTW - the number was 10917.
Update: Holy shit, sqlite from C sucks. I just checked out the MySQL C++ connector. It only took me seconds to realize it's so much easier to use that it justifies refactoring the DB component. That's what I get for trusting other people who think highly of sqlite...
Update: Replaced sqlite with mysql c++. Ahhhhhh. Doing so reduced the amount of code in db_component by two thirds and has increased concurrency. I recall sqlite from C# to be nice but from C it blows chunks, as a friend from grad school used to say. Now he's a TV network infrastructure guy or something.
Picked up Navicat for the Mac, version 11. I used version 7 on Windows. It still sucks.
Update: mysql c++ comparatively rocks. A primitive pattern-based search on any field in a track is running between text-based client and server.
Tomorrow, put audio into the server and play some music on command!
I am extracting tags from media using ffprobe via pipe. The tags are going into a tag -> value map which will be persisted in the database.
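A minimal sketch of that pipe; the ffprobe flags shown are one way to get "TAG:name=value" lines out of -show_format, and the exact invocation in pas may differ:

```
#include <cstdio>
#include <map>
#include <string>

// Run ffprobe on one file and collect its format-level tags into a map.
// NB: a real version must escape the path before handing it to a shell
// (file names can contain shell metacharacters).
std::map<std::string, std::string> ExtractTags(const std::string & path)
{
    std::map<std::string, std::string> tags;
    std::string cmd = "ffprobe -v quiet -show_format \"" + path + "\"";

    FILE * p = popen(cmd.c_str(), "r");
    if (p == nullptr)
        return tags;

    char line[1024];
    while (fgets(line, sizeof(line), p) != nullptr)
    {
        std::string s(line);
        if (s.compare(0, 4, "TAG:") != 0)
            continue;                              // only tag lines
        size_t eq = s.find('=');
        if (eq == std::string::npos)
            continue;
        std::string value = s.substr(eq + 1);
        if (!value.empty() && value.back() == '\n')
            value.pop_back();
        tags[s.substr(4, eq - 4)] = value;
    }
    pclose(p);
    return tags;
}
```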
I am choosing to defer crafting a strategy to economize on database size and instead will stuff everything from every track into the tracks table. I know full well that refactoring database connectivity can be a lot of work but:
- It is possible that database size and design won't in the end be worth optimizing till hundreds of thousands of tracks inhabit the database.
- This type of optimization isn't interesting at this time.
Next up:
Merging the tag extraction prototype into pas. Now... cleaning my office and taking out the recyclables.
Update: Office got cleaned. Recyclables still here. :(
Tag extraction and database insert is complete. I am using prepared statements for robustness so as to prevent the equivalent of a SQL injection via file names. Performance is not wonderful. sqlite reports busy a lot. I am not sure where the bottleneck is.
It isn't:
- The NAS - it barely breaks a sweat.
- The network - same.
It could be:
- sqlite poor locking strategies.
- Poor I/O performance (but storage is on eMMC).
- Use of prepared statements.
It could also be I am expecting too much.
I am extracting tag metadata and writing to sqlite at about 20 flac or mp3 files per second over the network. Actually, this isn't bad. I AM surprised at what appears to be lock contention in sqlite. Must be table level locking 👎.
Update: The way I'm spinning off media discovery tasks was a bad idea.
Using this breadth-first approach, as the number of subtrees dwindles, the number of threads dwindles until only one or two remain exploring the deepest subtrees. I will have to rewrite this to have all threads pull from a traditional "work list."
Next task is to rewrite the threading for media discovery to be a traditional "work list."
Then, it's time to circle back to the networking code to wire up an actual "client manager" that will hear input from a user.
Update from 4/4/17: I did rewrite the discovery code to be work-unit oriented. The last few tracks still go ponderously slowly without any apparent explanation (as yet).
No April Fool's joke - double-buffered audio is possibly done. At this moment, the test code is playing a flac version of The Doors' "The End" in the background, over my network from my NAS. No gaps or other underruns noticed yet.
Next up is to shift back to the database code. I'm not yet populating the database with tracks after enumerating them. Enumeration will be asynchronous so that I might be able to see the number of tracks increase as media is discovered. This ought to be the first usage of the command line test client.
The next step after that is likely to be adding some dummy test play commands - perhaps sqlite has a random row select.... SO says:
SELECT * FROM table ORDER BY RANDOM() LIMIT X
should work.
Update: Just ran 6 instances of pas at once on different flac's from the NAS. All of them rendered without hiccup. Very promising.
Update: Found and fixed two bugs. The first allowed the back buffer to be overwritten the first time through the play loop. The second: I did not account for the potential for data to arrive in a non-multiple of six bytes (24-bit samples * 2 channels). Reading fewer bytes at a time allows the conversion to stay far ahead, so all reads return the exact number of bytes requested.
Think I've done enough on this today. Finger tips hurt. With a bad back, all of this code is being written lying flat. Normal finger-pad typing is difficult and most keys are struck with only one hand. Getting old sucks.
After 2 or 3 part-time days of development, I am pleased with progress. There exists:
- a barebones command line client that will further testing.
- a barebones server on the other end to receive text-based commands.
- a multithreaded media discovery module.
- a test harness for decoding dozens of audio formats and playing audio through pulse.
Tomorrow I'll add my own threaded double buffering to bury latency induced by reading non-local files.
pas is Copyright © 2017 by Perry Kivolowitz - see license