Optional Skip

About

While writing a program that reads a CSV and prints it split into lines and cells, I ran into the following situation.

The code below reads a CSV string stream and prints it. parse_csv_document takes a header bool as its second argument, which is meant to let the caller optionally skip the CSV table header.

use std::io::{BufRead, Result};

fn parse_csv_document(src: impl BufRead, header: bool) -> Result<Vec<Vec<String>>> {
    let mut lines = src.lines();

    if !header {
        // ?
    }

    lines
        .map(|line| {
            println!("line: {:?}", line);
            line.map(|line| {
                line.split(",")
                    .map(|entry| String::from(entry.trim()))
                    .collect()
            })
        })
        .collect()
}

const CSV_STRING: &str = "KO-KR,KO-KR Alphabet,EN Alphabet,
기역,ㄱ,a
니은,ㄴ,b
디귿,ㄷ,c
";

fn main() {
    let res = parse_csv_document(CSV_STRING.as_bytes(), false);

    println!("{:?}", res);
}

In this situation, what code should go inside the if !header block?

1. lines.next()

This is the simplest approach. next() takes the iterator mutably (&mut self) and advances it to the next line.

if !header {
    lines.next();
}

However, I was not happy that a single call only skips one line.

2. lines.skip(N)

Honestly, I expected this to work. But skip() does not mutate the existing lines; it returns a new iterator that skips the first N items.

It also takes ownership of lines (a move), so lines cannot be used after that call.

if !header {
    lines.skip(1);
    // note: `skip` takes ownership of the receiver `self`, which moves `lines`
}

For reference, the following does not work either, because it produces a type error.

let mut given_lines = src.lines();

let lines;
if header {
    lines = given_lines;
} else {
    lines = given_lines.skip(1);
    // Type mismatch [E0308] expected `Lines<impl BufRead+Sized>`,
    // but found `Skip<Lines<impl BufRead+Sized>>`
}
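One way to make both branches agree on a type is to box them as a trait object. The sketch below is only an illustration under that assumption: lines_after_header and its skip_header parameter are hypothetical names that do not appear in the original code, and boxing adds a small allocation and dynamic dispatch.

```rust
use std::io::{BufRead, Result};

// Hypothetical helper (not part of the original post): returns the line
// iterator, optionally skipping the first line. Both branches are boxed so
// that `Lines<_>` and `Skip<Lines<_>>` share one return type.
fn lines_after_header<'a>(
    src: impl BufRead + 'a,
    skip_header: bool,
) -> Box<dyn Iterator<Item = Result<String>> + 'a> {
    let lines = src.lines();
    if skip_header {
        Box::new(lines.skip(1))
    } else {
        Box::new(lines)
    }
}

fn main() {
    let csv = "name,value\na,1\nb,2\n";
    let rows: Result<Vec<String>> = lines_after_header(csv.as_bytes(), true).collect();
    println!("{:?}", rows);
}
```

The caller can then chain map and collect on the boxed iterator without caring which branch was taken.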

Of course, the easiest fix here is clearly next(), since only one line needs to be skipped. But what should we do when several lines must be skipped?

3. advance_by()

advance_by() is the method that fits this situation best.

It takes &mut self, so it modifies the existing iterator in place and lines is not moved.

lines.advance_by(3);

The downside is that, as of writing (September 2023), it is only available on Nightly.

So we have to write the function ourselves.

fn advance_by(iter: &mut impl Iterator, n: usize) -> Result<()> {
    for _ in 0..n {
        iter.next();
    }

    Ok(())
}

// ...
fn parse_csv_document(src: impl BufRead, header: bool) -> Result<Vec<Vec<String>>> {
    // ...
    if !header {
        advance_by(&mut lines, 1)?;
    }
    // ...
}

It is simple: run next() in a loop.

This way next() is called a dynamically chosen number of times, which imitates skip(N).
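The hand-written advance_by above discards whatever next() returns, so a read error on a skipped line goes unnoticed. As a hedged variation, the sketch below uses by_ref().take(n) on stable Rust, propagates the first I/O error, and reports how many lines were actually skipped. The name skip_lines and its return value are assumptions made for illustration only.

```rust
use std::io::{BufRead, Result};

// Hypothetical variant of the helper: stops early when the iterator is
// exhausted and propagates an I/O error from any skipped line.
fn skip_lines<T>(iter: &mut impl Iterator<Item = Result<T>>, n: usize) -> Result<usize> {
    let mut skipped = 0;
    // `by_ref()` borrows the iterator, so the caller can keep using it afterwards.
    for item in iter.by_ref().take(n) {
        item?; // surface a read error instead of silently dropping it
        skipped += 1;
    }
    Ok(skipped)
}

fn main() -> Result<()> {
    let mut lines = "h1\nh2\na\nb\n".as_bytes().lines();
    let skipped = skip_lines(&mut lines, 2)?;
    let rest: Vec<String> = lines.collect::<Result<_>>()?;
    println!("skipped {skipped}, rest = {rest:?}");
    Ok(())
}
```

Because take(n) stops pulling items once n have been yielded, no extra line is consumed beyond the requested count.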

Of course, if you would rather not mutate the iterator, combining skip and map works as well.
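A minimal sketch of that skip and map combination follows. It relies on the fact that skip(0) and skip(1) have the same type, so choosing the count at runtime avoids the type mismatch seen earlier. parse_csv_skipping and skip_header are hypothetical names; here true means the first line is dropped.

```rust
use std::io::{BufRead, Result};

// Sketch of the "skip + map" combination. `skip_header` is a hypothetical
// parameter name: when true, the first line is dropped.
fn parse_csv_skipping(src: impl BufRead, skip_header: bool) -> Result<Vec<Vec<String>>> {
    // skip(0) and skip(1) are both `Skip<Lines<_>>`, so the types always match.
    let to_skip = if skip_header { 1 } else { 0 };
    src.lines()
        .skip(to_skip)
        .map(|line| {
            // Split each successfully read line into trimmed cells.
            line.map(|line| line.split(',').map(|cell| cell.trim().to_string()).collect())
        })
        .collect()
}

fn main() {
    let csv = "name,value\na,1\nb,2\n";
    println!("{:?}", parse_csv_skipping(csv.as_bytes(), true));
}
```

No mutable binding is needed, and the whole pipeline stays a single expression.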
