utf8.next()

Type Function
Library utf8.*
Return value Numbers or iterator
Revision Release 2024.3703
Keywords utf8, UTF-8, Unicode, string, next

Overview

Examines a UTF-8 string or iterates through it, depending on usage. Used as the iterator function of a generic for loop, it visits each character in turn:

for charpos, codepoint in utf8.next, "UTF8-string" do
    print( charpos, codepoint )
end

In all cases, this function returns the new character position (in bytes) and the code point (a number) at that position.
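To make the returned pairs concrete, here is a minimal pure-Lua sketch of how a byte position and the code point at that position relate. It is an illustration only and not how the plugin is implemented: decodeAt is a hypothetical helper that is not part of utf8.*, and it handles only well-formed input.

```lua
-- Hypothetical helper, NOT part of plugin.utf8: decodes the UTF-8 sequence
-- starting at byte position `pos`; returns its code point and byte length,
-- or nil if `pos` is out of range or does not point at a lead byte.
local function decodeAt( s, pos )
    local b1 = s:byte( pos )
    if not b1 then return nil end

    local len, cp
    if b1 < 0x80 then
        len, cp = 1, b1                 -- single-byte (ASCII) character
    elseif b1 >= 0xC0 and b1 < 0xE0 then
        len, cp = 2, b1 - 0xC0          -- lead byte of a 2-byte sequence
    elseif b1 >= 0xE0 and b1 < 0xF0 then
        len, cp = 3, b1 - 0xE0          -- lead byte of a 3-byte sequence
    elseif b1 >= 0xF0 and b1 < 0xF8 then
        len, cp = 4, b1 - 0xF0          -- lead byte of a 4-byte sequence
    else
        return nil                      -- continuation or invalid byte
    end

    -- Each continuation byte contributes its low 6 bits.
    for i = 1, len - 1 do
        local b = s:byte( pos + i )
        if not b or b < 0x80 or b >= 0xC0 then return nil end
        cp = cp * 64 + ( b - 0x80 )
    end
    return cp, len
end

-- Walk the string character by character, advancing by byte length.
local testStr = "♡ 你好"
local pos = 1
while pos <= #testStr do
    local codepoint, len = decodeAt( testStr, pos )
    if not codepoint then break end
    print( pos, codepoint )
    pos = pos + len
end

--> 1   9825
--> 4   32
--> 5   20320
--> 8   22909
```

The positions jump by 3 for "♡", "你" and "好" because each occupies three bytes, while the space occupies one.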

Syntax

utf8.next( s [, charpos [, offset]] )

s (required)

String. The UTF-8 string to examine or iterate through.

charpos (optional)

Number. The character position, in bytes, to start at.

offset (optional)

Number. The offset, counted in UTF-8 characters rather than bytes, to move from charpos.

Examples

Next Offset

local utf8 = require( "plugin.utf8" )

local testStr = "♡ 你好，世界 ♡"

print( utf8.next( testStr, 2 ) )  --> 3  161

Iterator

local utf8 = require( "plugin.utf8" )

local testStr = "♡ 你好，世界 ♡"

for charpos, codepoint in utf8.next, testStr do
    print( charpos, codepoint )
end

--> 1   9825
--> 4   32
--> 5   20320
--> 8   22909
--> 11  65292
--> 14  19990
--> 17  30028
--> 20  32
--> 21  9825