Nomad app getting I/O errors from deployment environment?

Has anybody experienced anything like this?

2023-06-02T19:26:03Z app[4d5727b3] ord [info][82399.462095] blk_update_request: I/O error, dev vdb, sector 496768 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:03Z app[4d5727b3] ord [info][82399.490500] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:03Z app[4d5727b3] ord [info][82399.505954] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:03Z app[4d5727b3] ord [info][82399.519838] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:03Z app[4d5727b3] ord [info][82399.535254] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:04Z app[4d5727b3] ord [info][82399.572273] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:04Z app[4d5727b3] ord [info][82399.583517] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:04Z app[4d5727b3] ord [info][82399.593161] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:04Z app[4d5727b3] ord [info][82399.602559] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:04Z app[4d5727b3] ord [info][82399.612123] blk_update_request: I/O error, dev vdb, sector 329184 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2023-06-02T19:26:04Z app[4d5727b3] ord [info][2023-06-02 19:26:04 +0000] [533] [ERROR] Socket error processing request.

I ran fly deploy and it seems to have staved off the issue. I’ll follow up if it happens again. I have to be honest: this shakes my confidence in depending on fly.io for production software.

My first assumption when I see this is that it might be related to Nomad. We have run into issues before where it tries to mount a volume that is already in use, which leads to failed writes, or potentially even data corruption. This is extremely rare, but what you encountered matches the errors we got then.

I know you’ve probably seen us talk it up a lot on these forums, but migrating your app to V2 might help prevent this kind of error. When the stack you’re running on is actually built for what you’re using it for, it works more reliably!

Regardless, I’m happy it’s working now, although you might want to check any attached volumes for data integrity.
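
One rough way to act on the suggestion above is to read every file on the volume and note which ones the kernel refuses to return. The following is a minimal sketch, not an official Fly.io tool: the function name scan_tree and the example mount path are hypothetical, and it only detects files that fail to read. It cannot detect silent corruption, and a file served from the page cache may read cleanly even if the underlying block is bad.

```python
import os


def scan_tree(root, chunk_size=1 << 20):
    """Read every regular file under root.

    Returns (files_read_ok, failures), where failures is a list of
    (path, errno) pairs for anything that raised OSError.
    """
    ok = 0
    failures = []

    def on_walk_error(err):
        # os.walk passes directories it cannot list to this callback.
        failures.append((err.filename, err.errno))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are read.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    # Read to end of file in chunks so large files
                    # do not have to fit in memory.
                    while fh.read(chunk_size):
                        pass
                ok += 1
            except OSError as err:
                # errno 5 (EIO) is the "I/O error" seen in this thread.
                failures.append((path, err.errno))
    return ok, failures
```

Calling scan_tree("/data"), with /data standing in for wherever the volume is mounted, returns the number of readable files and a list of paths that failed. Any entry with errno 5 points at the same kind of read failure shown in the logs.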

Looks like it’s happening again.

2023-06-03T22:27:31Z app[ec8f5fdc] ord [info]OSError: [Errno 5] I/O error: '/usr/local/lib/python3.10/site-packages/django_filters/templates/500.html'

I’ll consider migrating to v2. I was under the impression that v1 is still a viable platform, just deprecated. Maybe I misunderstood the public announcement about v2.

Here is the dmesg output from the console:

[96401.787526] blk_update_request: I/O error, dev vdb, sector 8461672 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
[96401.788856] EXT4-fs error (device vdb): __ext4_find_entry:1534: inode #270346: comm gunicorn: reading directory lblock 0
[96401.790676] EXT4-fs error (device vdb): __ext4_find_entry:1534: inode #270346: comm gunicorn: reading directory lblock 0
[97650.914486] print_req_error: 12 callbacks suppressed
[97650.914490] blk_update_request: I/O error, dev vdb, sector 4456680 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[97650.923958] blk_update_request: I/O error, dev vdb, sector 4456680 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[97654.480558] blk_update_request: I/O error, dev vdb, sector 74192 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
[97654.481787] EXT4-fs error: 10 callbacks suppressed
[97654.481789] EXT4-fs error (device vdb): __ext4_find_entry:1534: inode #94: comm ash: reading directory lblock 0
[97654.490944] blk_update_request: I/O error, dev vdb, sector 28936 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
[97654.492186] EXT4-fs error (device vdb): __ext4_get_inode_loc_noinmem:4446: inode #40961: block 3617: comm ash: unable to read itable block
[97654.501767] blk_update_request: I/O error, dev vdb, sector 74192 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
[97654.503247] EXT4-fs error (device vdb): __ext4_find_entry:1534: inode #94: comm ash: reading directory lblock 0
[97654.512433] blk_update_request: I/O error, dev vdb, sector 28936 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
[97654.513683] EXT4-fs error (device vdb): __ext4_get_inode_loc_noinmem:4446: inode #40961: block 3617: comm ash: unable to read itable block
[97654.523130] blk_update_request: I/O error, dev vdb, sector 74192 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
[97654.524340] EXT4-fs error (device vdb): __ext4_find_entry:1534: inode #94: comm ash: reading directory lblock 0
[97654.533344] blk_update_request: I/O error, dev vdb, sector 28936 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
[97654.534489] EXT4-fs error (device vdb): __ext4_get_inode_loc_noinmem:4446: inode #40961: block 3617: comm ash: unable to read itable block
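
Since the same few sectors repeat in the output above, it can help to tally them outside the affected container, for example on a workstation with the log text saved to a file. This is a minimal sketch under the assumption that the lines follow the "I/O error, dev ..., sector ... op ...:(READ)" format seen in this thread; the names summarize_io_errors and affected_devices are hypothetical, and other kernel versions may format these messages differently.

```python
import re
from collections import Counter

# Matches the block-layer error lines as they appear in this thread.
IO_ERROR_RE = re.compile(
    r"I/O error, dev (\w+), sector (\d+) op 0x[0-9a-fA-F]+:\((\w+)\)"
)


def summarize_io_errors(lines):
    """Count I/O errors per (device, sector, operation)."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            dev, sector, op = match.groups()
            counts[(dev, int(sector), op)] += 1
    return counts


def affected_devices(counts):
    """Return the sorted list of device names that reported errors."""
    return sorted({dev for dev, _sector, _op in counts})
```

Passing the saved log lines to summarize_io_errors and then calling most_common() on the result lists the sectors that fail most often. If every entry names the same device, as with vdb here, that narrows the problem to one block device rather than the application.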

I can’t troubleshoot this on my own because even standard tools in my container fail with I/O errors.

-ash: strace: I/O error
/app # find
-ash: find: I/O error

Might there be an issue with the host that’s running my container?
